MIT/LCS/TR-237 
TOWARDS A THEORY 
FOR 
ABSTRACT DATA TYPES 


Deepak Kapur 




Copyright Massachusetts Institute of Technology 1980 


May 1980 


This research was supported in part by the Advanced Research Projects Agency of the 
Department of Defense, monitored by the Office of Naval Research under contract 
N00014-75-C-0661, and in part by the National Science Foundation under grant 
MCS74-21892 A01.


Massachusetts Institute of Technology 
Laboratory for Computer Science 


Cambridge Massachusetts 02139 


Towards a Theory for Abstract Data Types
Abstract 


A rigorous framework for studying immutable data types having nondeterministic
operations and operations exhibiting exceptional behavior is developed. The framework
embodies the view of a data type taken in programming languages and supports
hierarchical and modular structure among data types.


The central notion in this framework is the definition of a data type. An algebraic and
behavioral approach for defining a data type is developed which focuses on the
input-output behavior of a data type as observed through its operations. The definition of
a data type abstracts from the representational structure of its values as well as from the
multiple representations of the values for any representational structure.


A hierarchical specification language for data types is proposed. The semantics of a
specification is a set of related data types whose operations have the behavior captured by
the specification. A clear distinction is made between a data type and its specification(s).
The normal behavior and the exceptional behavior of the operations are specified
separately. The specification language provides mechanisms to specify (i) a precondition
for an operation, thus stating its intended inputs, (ii) the exceptions which must be signalled
by the operations, and (iii) the exceptions which the operations can optionally signal. Two
properties of a specification, consistency and behavioral completeness, are defined. A
consistent specification is guaranteed to specify at least one data type. A behaviorally
complete specification 'completely' specifies the observable behavior of the operations on
their intended inputs.


A deductive system based on first order multi-sorted predicate calculus with identity is
developed for abstract data types. It embodies the general properties of data types, which
are not explicitly stated in a specification. The theory of a data type, which consists of a
subset of the first order properties of the data type, is constructed from its specification.
The theory is used in verifying programs and designs expressed using the data type. Two
properties of a specification, well definedness and completeness, are defined based on what
can be proved from it using different fragments of the deductive system. The sufficient
completeness property of Guttag and Horning is also formalized and related to the
behavioral completeness property. The well definedness property is stronger than the
consistency property, because the well definedness property not only requires that the
specification specifies at least one data type, but also captures the intuition that it preserves
other specifications used in it, thus ensuring modular structure among specifications. The
completeness property is stronger than the sufficient completeness property, since in
addition to the requirement that the behavior of the observers can be deduced on any
intended input by equational reasoning, it also requires that the equivalence of the
observable effect of the constructors can be deduced from the specification by equational
reasoning.


A correctness criterion is proposed for an implementation coded in a programming
language with respect to a specification. It is defined as a relation between the semantics of
an implementation and the semantics of a specification. It does not require a correct
implementation to have the maximum amount of nondeterminism specified by a
specification. A methodology for proving correctness of an implementation is developed
which embodies the correctness criterion.


Name and Title of Thesis Supervisor: Barbara H. Liskov
Associate Professor of Electrical Engineering 
and Computer Science 


Key Words and Phrases: Abstract Data Type, Data Type, Data Abstraction, Type Algebras,
Nondeterminism, Exceptions, Specification Language, Semantics,
Consistency, Behavioral Completeness, Deductive System,
Verification, Proof Technique, Sufficient Completeness,
Completeness, Well Definedness


This report is a minor revision of a thesis of the same title submitted to the Department of
Electrical Engineering and Computer Science in March 1980 in partial fulfillment of the
requirements for the degree of Doctor of Philosophy.




Acknowledgments 


I am thankful to my thesis supervisor, Professor Barbara Liskov, for her patience and
encouragement during the thesis research and especially during the later stages; to 
Professor John Guttag for posing many challenges and for many suggestions leading to 
improvements in the presentation of the thesis; to Professor Carl Hewitt for helping me 
organize and present my ideas in the early stage of the research; and to Professor Hal 
Abelson for diligently reading the final draft and making many helpful comments. 


My officemates, Valdis Berzins, Srivas Mandayam, and Carl Seaquist have helped me in
many ways during the thesis research. They gave me an audience whenever I needed,
helped me organize my ideas, and found time to read my work whenever I asked them
irrespective of their other important responsibilities. Carl and Srivas provided a very
stimulating and encouraging atmosphere during the last year. I am also thankful to Russ
Atkinson, Moms Krishnamurthy, Dave Musser, Gene Stark, and Jeannette Wing for their
helpful comments. Eliot Moss is to be thanked for producing and maintaining the software
necessary for the production of this document.


The graduate study at MIT has provided me a unique opportunity to live outside of my
own country which has been a tremendous learning experience. Besides computer science,
I have learnt a great deal about life, this country, my country, and myself, which has
fundamentally changed my attitude and outlook towards life. For this, I am indebted to
the students and staff of the Seminar on International Students and Their Participation in
Development, and my friends, especially Arvind, Ashok, Carl, Kanchan, Krishna,
Mukundan, Nagu, Ravi, Rashid, Sekhar, Srivas, Vaqar, and Vinod. Without their
encouragement and interest, continuing the thesis research would not have been possible.
Roli has contributed to the completion of the thesis in her own unique way; in no way can I
adequately express my gratitude to her.


This research was supported in part by the Advanced Research Projects Agency of the 
Department of Defense, monitored by the Office of Naval Research under contract
N00014-75-C-0661, and in part by the National Science Foundation under grant
MCS74-21892 A01.




Table of Contents

1. Introduction
   1. Scope and Approach of the Thesis
      1. Scope and Assumptions
      2. Definition of a Data Type
      3. Specification Method
      4. Deductive System
      5. Correctness of Implementation
   2. Related Work
   3. Outline of the Thesis

2. Definition of an Abstract Data Type
   1. Informal Description of a Data Type
      1. Terminology
      2. Hierarchical Structure
      3. Minimality Property
   2. Formalism
      1. Type Algebras
      2. Examples of Type Algebras
      3. Interpretation of Terms
      4. Observable Behavior
         1. Definitions of Observable Equivalence and Distinguishability
         2. Reduced Algebras
      5. Behavioral Equivalence of Type Algebras
      6. Definition of a Data Type
      7. Observable Equivalence and Distinguishability of Terms
   3. Exceptional Behavior of a Data Type
      1. Assumptions about Exception Handling Mechanism
      2. Formalism
         1. Terms, Exception Terms, and Interpretations
         2. Examples of Modified Type Algebras
         3. Observable Behavior and Distinguishability
         4. Comparison with Goguen's Approach
      3. A Simpler Approach
   4. Mutually Recursive Data Types

3. Specification of an Abstract Data Type
   1. Specification Language
      1. Operations
      2. Auxiliary Functions
      3. Restrictions
         1. Preconditions
         2. Exception Conditions
         3. Discussion
      4. Axioms
      5. Specifying Nondeterministic Operations
      6. Specification of Mutually Recursive Data Types
   2. Semantics of Specification Language
      1. Specifications without Auxiliary Functions
         1. Restrictions
         2. Axioms
      2. Specifications with Auxiliary Functions
      3. Semantics of a Specification
   3. Specification of a Data Type and Equivalence of Specifications
   4. Specification of Bool
   5. Properties of a Specification
      1. Consistency
      2. Behavioral Completeness
         1. Partial Isomorphic Equivalence
         2. Isomorphic Embeddability
         3. Partial Isomorphic Embeddability
         4. Definition of Behavioral Completeness
   6. Comparison with Related Works

4. Deductive System
   1. Preliminaries
   2. Theory of Data Types without Nondeterminism and without Exceptional Behavior
      1. Derivation of Nonlogical Axioms
      2. Equational Subtheory
      3. Distinguishability Subtheory
      4. Inductive Subtheory
         1. Infinite Induction Rule
         2. Rationale for an Infinite Induction Rule
         3. Use of the Induction Rule
         4. Specifications with Nontrivial Preconditions for Constructors
      5. The Full Theory
      6. Properties of a Specification
         1. Sufficient Completeness
         2. Completeness
         3. Well Definedness
      7. Automation of IND(S)
   3. Theory of Exceptions without Nondeterminism
      1. Derivation of Nonlogical Axioms
         1. Restrictions Component
         2. Axioms Component
         3. Definition of N?p: OP
      2. Equational Subtheory
      3. Distinguishability Subtheory
      4. Inductive Subtheory
      5. The Full Theory
      6. Properties of a Specification
         1. Sufficient Completeness
         2. Completeness and Well Definedness
   4. Theory of Nondeterminism
      1. Transformation Procedure TR
      2. TUS)
      3. Data Types with Exceptional Behavior
      4. Properties of a Specification
   5. Strong Equivalence of Specifications

5. Correctness of Implementation
   1. Correctness Criterion and Overview of Correctness Method
      1. Semantics of an Implementation
      2. Correctness Method
         1. Nondeterminism
         2. Definition of Correctness
   2. Implementation Structure and Semantics
      1. Procedures - Approach I
      2. Procedures - Approach II
      3. Properties of the Encapsulation Mechanism
      4. Semantics of an Implementation
   3. Correctness Method
      1. Auxiliary Functions in a Specification
      2. Preservation of Inv
      3. Termination of Procedures
      4. Proving Restrictions and Axioms
         1. Preservation of Equivalence Relation
         2. Restrictions
         3. Axioms
      5. Nondeterministic Procedures
      6. Pseudo-Nondeterministic Procedures
   4. Recursive and Mutually Recursive Implementations
      1. Recursive Implementations
      2. Mutually Recursive Implementations

6. Conclusions
   1. Summary of Contributions
   2. Directions for Further Research

References

Appendix I. Elaboration of Scope and Assumptions
   1. Immutable and Mutable Data Types
   2. Exceptional Behavior
   3. Nondeterminism

Appendix II. Definitions of Algebraic Concepts and Proofs of Theorems in Chapter 2
   1. Congruence, Homomorphism, and Isomorphism
   2. Proof of Theorem 2.2
   3. Elaboration of the Definition of Behavioral Equivalence
      and Proofs of Theorems 2.5 and 2.6

Appendix III. Proofs of Theorems in Chapter 4

Appendix IV. Specifications of Data Types used in Chapter 5


1. Introduction 


The role of abstraction, modularity and hierarchical structure has been well 
recognized in the literature on program design and construction [12, 66, 73]. Data 
abstraction, in particular, has been found to be a useful abstraction mechanism in the 
design and construction of well structured programs (sy. Most of the recent 
programming languages encourage the use of abstract data types by providing an 
encapsulation mechanism for implementing them [65, 49, 52, 75, 45, 1]. It is necessary to 
develop a rigorous foundation for abstract data types so that the informal concept of an 
abstract data type can be placed on a firm and sound basis, and various aspects of this 
concept can be studied and analyzed. 

In this thesis, we develop a framework for abstract data types. The central notion 
in this framework is the definition of an abstract data type. We develop a behavioral 
method for defining a class of abstract data types, called immutable data types [49, 52]. An 
immutable data type is defined as a set of behaviorally equivalent algebras having 
interpretations for the values and the operations of the data type. Behaviorally equivalent 
algebras have the same behavior as observed through their operations. We propose a 
specification language for abstract data types. The semantics of a specification is a set of
related data types sharing the common behavior captured by the specification. We make a 
clear distinction between a data type and its specification(s). We develop a deductive 
system for abstract data types embodying their general properties which are not explicitly 
stated in a specification. We use the deductive system to prove properties of an abstract 
data type from its specification. We propose a correctness criterion for an implementation 
of an abstract data type with respect to its specification, and develop a methodology for 
proving correctness of an implementation with respect to a specification which embodies
the proposed criterion.


1. The terms abstract data type, data type, data abstraction, and type are used synonymously in this thesis. 
2. Liskov and Zilles [47] emphasize the need for rigorously developing the mathematical foundation of the
specification methods for abstract data types. 




The main contribution of this research is a framework for abstract data types that
is rigorous and that brings together various aspects of abstract data types in a unified and
coherent way. Our approach is better than other similar attempts, in particular the initial
algebra formalism of the ADJ group [23] and the category theory formalism of Goguen
[20, 7, 30], because it is more in tune with the way programming languages support the
mechanism of abstract data type. The framework incorporates important and useful
features such as hierarchical structure and modularity. It is also broader in scope, as it
handles data types with nondeterministic operations and with operations exhibiting
exceptional behavior. We had originally developed the framework without considering
nondeterminism and exceptional behavior; however, we did not encounter any major
difficulties in extending it to incorporate nondeterminism and exceptional behavior. This
makes us believe that our framework is robust and extensible for studying other aspects of
data type behavior not discussed in this thesis.

Our framework will be useful to a designer of a specification language for abstract
data types as it provides a semantic basis for studying and comparing such specification
languages. It can be used to define the semantics of a specification language. It also
provides a formal basis for automatic deductive systems for abstract data types, such as
AFFIRM [60]. It suggests an approach for studying and extending the method of
reasoning about data types developed in the thesis. Other methods of reasoning can also be
developed using it. Furthermore, this research clarifies our intuitions about data type
behavior and provides a formal basis for them; as examples, the notions of consistency and
sufficient completeness advocated by Guttag and Horning [28], and the correctness
criterion for an implementation [29, 40] can be stated formally and analyzed.

Our research has been highly influenced by Peano's method of defining natural
numbers and McCarthy's method of defining S-expressions [57]. We are intellectually
indebted to Zilles [77] and the ADJ group [23], for their work on the algebraic approach for
abstract data types, and to Guttag et al. [25, 28, 29] for their work on specification
technique for abstract data types which emphasizes programmers' intuitions about data
types. We cite other related works in Section 1.2, and state how we plan to compare these
works with that discussed in the thesis.




1.1 Scope and Approach of the Thesis


We first state the scope of the thesis and the assumptions made about the data
type behavior. The scope and assumptions are further discussed in Appendix I. Later, we
give an overview of the approach taken in studying four issues, namely, definition,
specification, deductive system, and implementation correctness.

1.1.1 Scope and Assumptions


In our research, we have considered immutable data types having
nondeterministic operations and operations exhibiting exceptional behavior. Every
operation is assumed to be total and computable; see [42] for a precise characterization of
computability on the values of a data type. It terminates on every input in its domain either
normally by returning a value of its range type or by signalling an exception. A
nondeterministic operation has only finitely many choices on an input. If a
nondeterministic operation signals on an input, it is assumed to behave deterministically on
that input. So, it does not have a choice between signalling and terminating normally on a
particular input. Henceforth, by a data type, we mean an immutable data type with the
above behavior, and by an object, we mean an immutable object.


1.1.2 Definition of a Data Type


Our formalism for defining a data type is algebraic in the style of Zilles [77] and
the ADJ group [23]. Algebras are a natural and elegant way to define an immutable data
type, because an immutable data type is informally a set of values and a set of operations.
In a programming language supporting data types, the most important aspect of a data type
to its designer as well as its user is the input-output behavior of its operations [37, 47, 5].
The values of a data type are manipulated only by its operations. Outside its
implementation module(s), the values are viewed abstractly as sequences of operations.
The details about the representations of values and the operations of a data type are of no
relevance.³ To a user, two distinct representations are behaviorally identical if they cannot
be distinguished by the operations of the data type. We call this view the behavioral view
of a data type. The behavioral view abstracts from the representational structure of the
values as well as from the multiple representations of a value for any representational
structure. It is a further abstraction on the view of a data type adopted by ADJ [23] and
Zilles [77] which abstracts only from the representational structure of the values.
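
The behavioral view can be illustrated with a minimal sketch in Python, which is not part of the thesis formalism. It assumes a hypothetical representation of a finite set of integers as a tuple that may list its elements in any order and with duplicates; the operation names `insert`, `has`, and `size` are illustrative. The check is bounded (a few probe values and short constructor sequences), so it can only suggest, not prove, that two representations are indistinguishable.

```python
from itertools import product

# Hypothetical representation: a tuple of integers, in any order, possibly
# with duplicates. Many tuples stand for the same abstract set.
def insert(rep, x):
    """Constructor: returns a new representation containing x."""
    return rep + (x,)

def has(rep, x):
    """Observer returning a boolean."""
    return x in rep

def size(rep):
    """Observer returning an integer."""
    return len(set(rep))

def observably_equal(r1, r2, probes=range(-2, 3), depth=2):
    """Bounded check that no observer separates r1 and r2, even after
    applying the same short sequence of constructors to both."""
    for k in range(depth + 1):
        for xs in product(probes, repeat=k):
            a, b = r1, r2
            for x in xs:
                a, b = insert(a, x), insert(b, x)
            if size(a) != size(b):
                return False
            if any(has(a, p) != has(b, p) for p in probes):
                return False
    return True

# Two distinct representations of the set {1, 2} behave identically ...
same = observably_equal((1, 2), (2, 1, 1))
# ... while representations of {1} and {2} are told apart by has.
different = observably_equal((1,), (2,))
```

Here `(1, 2)` and `(2, 1, 1)` differ as representations but not in behavior, which is the distinction the behavioral view abstracts away.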

In a programming language supporting modularity and hierarchical structure
such as CLU, EUCLID, etc., data types are implemented hierarchically one at a time,
except that mutually recursive data types are implemented together, as a group; data types
other than those being implemented are assumed to be implemented elsewhere.⁴ We take
the same approach in defining a data type. Our definitional method is hierarchical. We
distinguish between the data type(s) being defined and other data types used in the
definition. We call the data type(s) being defined the defined type(s) and other data types
in the definition the defining types. The distinction between the defined type and defining
types is significant because the behavior of the values of the defined type is observed by the
operations which return the values of the defining types. This was first pointed out by
Guttag [25], and is the basis of his definition of the sufficient completeness property. We
use the data type boolean, which is self-contained and does not have any defining types, as
the basis of our definitional method. We assume its definition and that all boolean values
are distinguishable. In fact, any data type whose values can be distinguished a priori
(outside the formalism) can be used as the basis. For example, any data type directly
supported in a programming language whose values are distinguishable using the literal
(constant naming) mechanism in the programming language is a suitable candidate.

3. We will not be concerned about other issues, such as efficiency of the operations, etc., relevant to a user of
a data type. Our formalism is limited in this sense.

4. Mutually recursive data types are different from mutually recursive implementations; see Chapter 5 for a
detailed discussion.

We classify the operations of a data type into two categories: the constructors,
which construct the values of the data type, and the observers, which return the values of
the defining types. A value of a data type manifests its behavior through the observers with
the help of constructors.

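
The classification into constructors and observers depends only on the range of each operation. The following Python sketch, offered as an illustration under assumptions, classifies the operations of a hypothetical signature for the data type finite set of integers. Only Choose is named in the text; `Null`, `Insert`, `Has`, and `Size` are illustrative names, and `Set` stands for the defined type with `Int` and `Bool` as defining types.

```python
# Hypothetical signature: operation name -> (domain types, range type).
DEFINED_TYPE = "Set"
SIGNATURE = {
    "Null":   ((), "Set"),
    "Insert": (("Set", "Int"), "Set"),
    "Has":    (("Set", "Int"), "Bool"),
    "Size":   (("Set",), "Int"),
    "Choose": (("Set",), "Int"),
}

def classify(signature, defined_type):
    """Split operations into constructors (range is the defined type)
    and observers (range is one of the defining types)."""
    constructors = sorted(
        name for name, (_, rng) in signature.items() if rng == defined_type
    )
    observers = sorted(
        name for name, (_, rng) in signature.items() if rng != defined_type
    )
    return constructors, observers

constructors, observers = classify(SIGNATURE, DEFINED_TYPE)
```

Under this signature, values of `Set` are built by `Null` and `Insert` and can be told apart only through the `Bool` and `Int` results of `Has`, `Size`, and `Choose`.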
Our approach for modeling the exceptional behavior embodies a practical view of
exceptions. Each exception is named, and can have arguments that carry information to its
handler from the place where it is signalled. The exceptional behavior of the operations
can also be used to distinguish among different values. An operation can distinguish
between two values by signalling on one value and terminating normally on the other
value, or by signalling different exceptions on different values.
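
As a hedged illustration of how exceptional behavior can separate values, the Python sketch below models a named exception with arguments and an operation on stacks represented as tuples. The operation `drop_two` and the exception names `empty` and `too_short` are hypothetical and do not come from the thesis; comparing exception arguments is also an assumption of this sketch.

```python
class Signal(Exception):
    """A named exception whose arguments carry information to the handler."""
    def __init__(self, name, *payload):
        super().__init__(name, *payload)
        self.name = name
        self.payload = payload

def drop_two(stack):
    """Hypothetical operation: remove the two topmost elements."""
    if len(stack) == 0:
        raise Signal("empty")
    if len(stack) == 1:
        raise Signal("too_short", stack[0])
    return stack[:-2]

def observe(op, value):
    """Record the outcome of op on value: a normal result or a signal."""
    try:
        return ("normal", op(value))
    except Signal as s:
        return ("signal", s.name, s.payload)

def distinguishes(op, v1, v2):
    """True if op behaves observably differently on v1 and v2."""
    return observe(op, v1) != observe(op, v2)

# Signalling on one value and terminating normally on the other.
signal_vs_normal = distinguishes(drop_two, (), (1, 2))
# Signalling different exceptions on different values.
different_signals = distinguishes(drop_two, (), (5,))
# No difference is visible through this operation alone.
not_separated = distinguishes(drop_two, (1, 2), (3, 4))
```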
The model used for nondeterminism is simple. If a nondeterministic operation
behaves nondeterministically on an input (i.e., it has a choice to return one of the many
possible results), we expect it to return every possible result. We do not consider how these
results are scheduled by an implementation of the operation. Two operations having
different amounts of nondeterminism are considered to have different observable behavior
because for some input, they can always return distinguishable results. Data types with
operations having different amounts of nondeterminism are thus considered different. For
example, consider a data type finite set of integers with a nondeterministic operation
Choose which nondeterministically picks an arbitrary element from a nonempty finite set
of integers given as an argument. This data type is different from another similar data type
with the same set of operations which also have the same behavior, with the exception of
Choose, which is deterministic and returns the maximum integer of a nonempty set.
Furthermore, both data types are different from yet a third data type with the same set of
operations as the other two types except that Choose has a limited amount of
nondeterminism: Choose nondeterministically picks between the maximum and minimum
integers from a nonempty set.
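
The three variants of Choose can be compared in a small Python sketch, given here as an illustration rather than as the thesis's formal model. It assumes that an operation is modeled by the finite set of results it may return on each input; the function names are hypothetical, and every input is assumed to be a nonempty set.

```python
def choose_any(s):
    """Fully nondeterministic Choose: any element may be returned."""
    return frozenset(s)

def choose_max(s):
    """Deterministic Choose: always the maximum element."""
    return frozenset({max(s)})

def choose_extreme(s):
    """Limited nondeterminism: either the minimum or the maximum."""
    return frozenset({min(s), max(s)})

def same_behavior(op1, op2, inputs):
    """Two operations agree if they have the same possible results
    on every input considered."""
    return all(op1(s) == op2(s) for s in inputs)

INPUTS = [frozenset({7}), frozenset({1, 2}), frozenset({1, 2, 3})]

any_vs_max = same_behavior(choose_any, choose_max, INPUTS)
any_vs_extreme = same_behavior(choose_any, choose_extreme, INPUTS)
max_vs_extreme = same_behavior(choose_max, choose_extreme, INPUTS)
```

The three operations coincide on one-element sets, and `choose_any` and `choose_extreme` also coincide on two-element sets; a three-element set is needed to separate those two.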

1.1.3 Specification Method


A specification is mainly used, among other things, for reasoning about a data
type. So, our specification method is axiomatic in the style of Standish [69], Hoare [38, 39],
Guttag [26, 29], Nakajima et al. [62], etc. A specification embodies information hiding [66],
i.e., it only specifies the behavior of a data type. Our specification method is hierarchical.
Data types are specified incrementally, one at a time; a specification uses the specifications
of other data types. We believe that specifications should be modular and well structured
just like programs; otherwise, specifications of large problems become unmanageable and
difficult to understand.⁵

A specification expresses the properties particular to the data type(s) being
specified. It specifies (i) the domain, range, and the exceptions with the types of their
arguments, if any, signalled by every operation, and (ii) the normal behavior as well as the
exceptional behavior of the operations. The general properties of data types which hold for
every data type, for example, the minimality property which requires that every value of a
data type is constructed by finitely many applications of its constructors, are not included in
a specification.

The normal behavior of the operations is specified as a restricted set of formulas
of first order multi-sorted predicate calculus with identity. A typical formula is a
conditional equation relating different sequences of operations under a condition. A
specification can use a finite set of auxiliary functions so that any data type with a finite set
of total deterministic computable operations can be specified in this way [43]. A
nondeterministic operation is specified by stating properties of its possible results on an
input, rather than by explicitly specifying its relation, which holds for all possible results of
the operation and the input and does not hold for any other value and the input. For example,
in case of the data type finite set of integers, the nondeterministic operation Choose is
specified by relating its possible results to its set argument, instead of explicitly specifying
its relation Choose_p : Set-Int x Int --> Bool, which holds for a set and an integer if and only
if Choose can return the integer when applied on the set.
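The two styles of describing Choose can be contrasted in a small Python sketch. It is only an illustration and not the specification language of the thesis: the names choose_p and satisfies_choose_property are hypothetical, and finite sets of integers are modeled by Python's frozenset.

```python
# Illustrative sketch only; names and encoding are assumptions, not the thesis's notation.

def choose_p(s: frozenset, i: int) -> bool:
    """Explicit relation Choose_p : Set-Int x Int --> Bool.
    Holds iff an unrestricted Choose may return i when applied to s."""
    return i in s

def satisfies_choose_property(s: frozenset, results: set) -> bool:
    """Property-style description: every possible result of Choose on a
    nonempty set s must be an element of s."""
    return len(s) > 0 and len(results) > 0 and all(r in s for r in results)

s = frozenset({3, 7, 9})
# A Choose with limited nondeterminism (only the minimum or the maximum)...
limited = {min(s), max(s)}
# ...and a fully nondeterministic Choose both satisfy the same property.
full = set(s)
```

Under these assumptions, both result sets are admitted by the property, while only the explicit relation pins down exactly which integers may be returned.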
The exceptional behavior of the operations is specified as a separate layer on top
of the normal behavior. Following Guttag [31], if an operation signals an exception, we
specify the condition on its input under which the exception is signalled.⁶ The
specification language provides mechanisms to specify the exceptions which must be
signalled by the operations as well as the exceptions which the operations can optionally
signal. The specification also allows a precondition on an operation to be specified, stating
that the behavior of the operation on inputs not satisfying the precondition is not of any
interest. A formula expressing the normal behavior of the operations holds only if the
inputs to the operations in the formula satisfy the specified preconditions and if the
operations do not signal; it thus has a restricted interpretation. A formula specifying the
normal behavior is called an axiom. The preconditions and the exceptional behavior of the
operations are specified using restrictions.

5. Burstall and Goguen [7] and Nakajima et al. [62] also emphasize the need for structured specifications.

Our approach of specifying data types is thus different from those of Zilles [77]
and the ADJ group [23]. In their approaches, a specification of a data type is a finite set of
identities (or conditional identities) presenting the set of algebras serving as the definition
of a data type. These identities are interpreted exactly the same way as in Universal
Algebra [4, 10]. We are also not constrained to employ only "equational" reasoning;
instead, our reasoning method embodies the general properties of data types, as is discussed
later.

The semantics of a properly designed specification is a set of related data types
which differ in the behavior intentionally not captured by the specification. If an operation
is specified to be nondeterministic, the semantics of a specification includes data types in
which that operation can have as much nondeterminism as desired insofar as the operation
behavior satisfies the axioms and restrictions expressed in the specification. We define
equivalence among specifications. We also state when a data type can be (precisely)
specified in the proposed specification language. We define two important properties of a
specification: the consistency property, which states whether a specification specifies any
data type; and the behavioral completeness property, which guarantees that the observable
behavior of the operations is not left unintentionally unspecified. These properties ensure
that various components of a specification have the desired structure. Checking for these
properties is a step towards ensuring that the specification captures the intuition of a
designer.

6. However, this way of specifying the exceptional behavior of the operations may be overly restrictive, as for
an operation, the subset of inputs on which it signals a particular exception may be very complex to specify.

In our research, a clear distinction is made between a data type and its
specification. In most of the literature on specification techniques for data types
[47, 25, 28, 29, 61, 77, 48, 37], this distinction is either not made or blurred if it is implied.
Most of the literature does not explicitly define what a data type is. The ADJ group [23]
was the first, to our knowledge, to explicitly state in their formalism a definition of a data
type and make this distinction. We believe the distinction between a data type and its
specification is useful and necessary in a formal treatment of data types. Given a definition
of a data type, different specification techniques can be developed to serve different
purposes, if needed, and their semantics can be given in terms of data types. Different
methods of reasoning about a data type can be developed incorporating the general
properties of data types, with the definition of a data type serving as their basis. The
question of whether a given data type can be specified using a particular specification
technique can arise only when this distinction is made; only then can different specification
techniques be compared in their expressive power. Only then is it meaningful to discuss
the properties of a specification technique such as the ease of expression,
comprehensibility, minimality, etc. [47]. (See [34] for a similar discussion for programs.)

A specification plays an important role in our research. It is used as a standard for
checking the correctness of an implementation as well as for deriving properties of the data
types specified, as is discussed in the next two subsections. It is an interface between the
programs using the data type and the program(s) implementing the data type. The
specifications of abstract data types are a major component of a program verification
system. Our specification method can be used to specify the behavior of the data
component of software designs; questions and inquiries about the data in a design can be
expressed and analyzed using the deductive system discussed in the next subsection. (See
the two survey papers on specification methods [47, 48], where the need for writing formal
specifications is discussed. Guttag and Horning [32] discuss the importance of formal
specifications as a design tool.)




1.1.4 Deductive System 


As was stated earlier, one of the main reasons for designing a specification is to
have an implementation independent description of the data type that can be used to
reason about the data type as well as to reason about the designs and programs using the
data type. We propose a deductive system based on first order multisorted predicate
calculus with identity for deriving properties of a data type from its specification. The
deductive system embodies the general properties of data types which are not explicitly
stated in a specification but assumed in its semantics. These properties are derived from
the syntactic structure of the operations.

The deductive system has an infinite rule which captures the minimality property 
of data types. The deductive system is powerful enough to prove inequalities. We 
axiomatize the general properties of the exceptional behavior of the operations. Properties 
expressed using nondeterministic operations can be proved. We construct a theory of a 
data type, which is a large subset of its first order properties, from its specification. If a 
specification specifies a set of related data types, every theorem in the theory constructed 
from the specification holds for each data type in the set. 

We define three other structural properties of a specification, namely, sufficient
completeness, well definedness, and completeness, based on what properties of a data type
can be deduced from its specification using different fragments of the deductive system.
We precisely state the sufficient completeness property defined by Guttag and Horning
[28] for a restricted set of specifications and extend it to specifications in our specification
language. This property requires that the behavior of the observers on their intended
inputs can be completely determined from the specification by purely equational
reasoning. We relate this property to the behavioral completeness property stated in the
previous subsection, which is model theoretic and which requires that the specification
completely specify the behavior of the observers on intended inputs. Recall that the
behavioral completeness property does not say anything about what can be deduced from
the specification. In this sense, the relation between behavioral completeness and sufficient
completeness reflects the power of the equational fragment of the deductive system.

The well definedness property is stronger than the consistency property, because
the well definedness property not only requires that a specification specifies at least one
data type, but also that it (the specification) is modular in the sense that it preserves the
specifications of other data types used in it.

The completeness property is stronger than the sufficient completeness property,
since in addition to the requirement that the behavior of the observers can be deduced on
any intended input by equational reasoning, it also requires that the equivalence of the
observable effect of the constructors on intended inputs can be deduced from the
specification by equational reasoning.


1.1.5 Correctness of Implementation 


We state the correctness criterion for an implementation coded in a programming
language with respect to a specification as a relation between the semantics of the
implementation and the semantics of the specification. Roughly speaking, a correct
implementation implements one of the data types in the semantics of a specification. Our
correctness criterion is weak as it does not require a correct implementation to have the
maximum amount of nondeterminism specified by a specification.

We develop a method for proving correctness of an implementation with respect
to a specification which embodies the correctness criterion. The method requires, among
other things, that the procedures implementing the operations satisfy the axioms and
restrictions in the specification when appropriately interpreted. We thus provide the
formal basis of the correctness method proposed by Guttag et al. [29] and extend it to
specifications specifying nondeterministic operations and operations exhibiting exceptional
behavior.
We distinguish among different procedures implementing an operation specified
to be nondeterministic, since the nondeterministic behavior of an operation on abstract
values can be implemented by a deterministic procedure on the representation of these
abstract values that returns different results on different but equivalent representations.
We call a procedure nondeterministic (respectively, deterministic) if it is nondeterministic
(respectively, deterministic) and it returns equivalent results on equivalent representations.
Otherwise, if a procedure returns different results on equivalent representations, then it is
called pseudo-nondeterministic irrespective of whether it is deterministic or
nondeterministic on the representations. We discuss the correctness method for these three
kinds of procedures implementing an operation specified to be nondeterministic.
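The distinction can be illustrated with a small Python sketch. It assumes, purely for illustration, that a set of integers is represented by a list of its elements in arbitrary order, so that [1, 2] and [2, 1] are different but equivalent representations; choose_first and choose_min are hypothetical procedure names.

```python
# Illustrative sketch; the list representation and the names are assumptions.

def abstract(rep: list) -> frozenset:
    """Abstraction function: the set of integers that a list represents."""
    return frozenset(rep)

def choose_first(rep: list) -> int:
    """Deterministic on representations, yet pseudo-nondeterministic:
    equivalent representations can yield different results."""
    return rep[0]

def choose_min(rep: list) -> int:
    """Deterministic: equivalent representations always yield the same result."""
    return min(rep)

r1, r2 = [1, 2], [2, 1]          # two representations of the set {1, 2}
same_value = abstract(r1) == abstract(r2)
```

Here choose_first returns different integers on r1 and r2 although both represent the same set, whereas choose_min agrees on both.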
1.2 Related Work 


In this section, we discuss different definitional and specification methods for data 
types, briefly stating the major differences as well as the main thrust of these works. The 
detailed comparison of these works with ours is contained in the rest of the thesis where we 
discuss various topics. 

The definitional methods for data types can be broadly classified as (i) the
algebraic or model approach, and (ii) the axiomatic approach. In the model approach, a
data type is defined as an algebra satisfying certain properties, or as a set of such algebras.
ADJ [23] defines a data type in this way. Though Hoare [37], Zilles [77], Guttag [28], and
Berzins [3] do not explicitly define what a data type is, their approaches suggest that a data
type is defined using the model approach. Our approach is also the model approach.

Nakajima et al. [62] take the axiomatic approach; they define a data type as a first
order multi-sorted theory. Recently Nourani [63] has also discussed the use of a first order
theory for defining a data type. Though this view of a data type is useful in program
verification, there is no explicit model of a data type to match with the intuition of a
designer of the data type. If a first order theory is interpreted as in Logic [16] and its
models are taken as the models of a data type being defined, then there are nonstandard
models for a data type, which are of no relevance to its designer. A nonstandard model
does not satisfy the minimality property of data types discussed in the next chapter. Hoare
[38, 39] has also used the axiomatic approach for defining a class of data types.

A survey of specification techniques for data types can be found in [47] and [48].
The specification techniques can be broadly classified into three categories based on their
approach: (i) the model approach, (ii) the algebraic approach, and (iii) the axiomatic
approach. The model approach is used only in case a data type is defined using the model
approach. A data type is specified by presenting one of its models. Berzins [3] has
formalized and extended the model approach originally proposed by Hoare [37]. He has
also related his research to other works using the model approach. We discuss here the
algebraic and axiomatic approaches.

The algebraic approach has been used by Zilles [77] and the ADJ group [23].
In this approach, a set of algebras defining a data type is presented as a finite set of
identities or conditional identities. Burstall and Goguen [7] and Goguen [20] specify a data
type as an algebraic theory.

The axiomatic approach for specifying a data type can be used for either of the
two definitional approaches discussed above. If a data type is defined using the model
approach, a specification using the axiomatic approach consists of the properties of the
models of a data type. Otherwise, a specification consists of a subset of the theory serving
as the definition of the data type. The axiomatic approach followed by Nakajima et al.,
Hoare [38, 39], and Standish [69] uses the full first order predicate calculus to specify data
types. The approach advocated by Guttag et al. uses a restricted set of formulas, namely
equations and conditional equations.

Our approach is also axiomatic. A specification expresses the normal behavior of
a data type(s) (which is a set of algebras) as equations and conditional equations, and its
exceptional behavior as restrictions. As is stated in the previous section, these formulas are
interpreted using the restrictions in a different way than in the algebraic approach. In
contrast to the specification methods proposed by Nakajima et al., Hoare, and Standish, the
general properties of data types are not explicitly stated in our method. A specification
provides an incomplete (in the sense of Logic) first order axiomatization of the data types
being specified. From a properly designed specification, it is possible to derive most of the
interesting properties of a data type needed in program verification.

The major focus of Zilles’ work and the ADJ group’s work has been to extend the
theory of heterogeneous algebras to capture the meaning of data types. They have not
investigated how to use the definition of a data type for proving properties of programs
using data types. Zilles [77] has suggested an ad hoc method for establishing correctness of
an implementation of a data type; however, the method as well as its foundation have not
been fully developed. The ADJ group and Ehrig et al. [13] have proposed an algebraic
approach for establishing the correctness of an implementation of a data type in which they
have attempted to incorporate the algebraic semantics of the control structures of the
programming language used for the implementation. Although the ADJ group’s work is
rigorous, there are two main problems with it:

(i) it has not embodied the view of data types taken in programming languages, and is 
thus useful only for a small set of data types, and 

(ii) it is complex. 
The approach taken by Burstall and Goguen [7] seems more promising than the ADJ 
group’s approach from the viewpoint of program verification, but, we have been told, its 
category theoretic semantics again seems to introduce unnecessary complexity [30]. 

Guttag et al. have focused on using specifications for proving properties of data
types and programs using data types. The nice aspect of their approach is that it captures
the view of a data type taken in programming languages. Our research formalizes, provides
a mathematical basis for, and extends their approach.

The ADJ group [23] has been the first to investigate rigorously the exceptional
behavior of a data type. In their method, the set of values of every data type is extended to
include a distinguished value, called error. Using special auxiliary functions which test
whether an arbitrary value is an error, they specify the exceptional and normal behavior of
a data type. Goguen [20] has enriched and structured their approach. Our approach is
based on Guttag’s recent suggestions for separating the exceptional behavior of a data type
from its normal behavior [31].


1.3 Outline of the Thesis


The second chapter introduces a formalism for defining a data type. We first
discuss the formalism for data types assuming that the operations do not signal exceptions.
Later, we extend the formalism to incorporate the exceptional behavior of the operations.

The third chapter describes the specification language, gives its semantics, and
defines the consistency and behavioral completeness properties of a specification.

The fourth chapter discusses the deductive system. We discuss how a theory of a
set of data types serving as the semantics of a specification can be constructed from the
specification. We first describe the deductive system for specifications specifying neither
nondeterministic operations nor the exceptional behavior of the operations; later, we
discuss specifications specifying the exceptional behavior of the operations, and finally, we
incorporate nondeterminism. We discuss the deductive system incrementally, introducing
its various components; we first discuss the equational theory, then the distinguishability
theory, later the inductive theory, and finally, the full theory.

The fifth chapter discusses a correctness criterion for an implementation with
respect to a specification and a methodology embodying the criterion. The correctness of
recursive and mutually recursive implementations is also briefly discussed.

The sixth chapter presents conclusions and directions for future research. 




2. Definition of an Abstract Data Type 


In this chapter, we develop a formalism to define an abstract data type. We take a
behavioral view for defining a data type in which every value of the data type is constructed
by finitely many applications of its constructors and these values are distinguishable only
by means of its operations. We adopt the model approach: a data type is defined to be a
set of behaviorally equivalent type algebras, where a type algebra is an extended
heterogeneous algebra with additional properties needed to model data types. The
syntactic structure of a data type determines the structure of type algebras in the set. Every
type algebra in the set is called a model of the data type. A model provides an explicit
meaning (interpretation) for the values and the operations of a data type; in this way, it
captures concretely the informal description of a data type in our mind. The model
approach for defining a data type is closer to the intuition of a programmer than the
axiomatic approach as in [62, 63], where a data type is defined as a first order theory.

The crucial concept in the definition of a data type is that of behavioral
equivalence of type algebras. The definition of behavioral equivalence captures the
informal notion that two behaviorally equivalent type algebras have the same behavior as
observed through their operations. We are interested in how the interpretations of the
values and the operations of a data type in a model behave, and not in how they are
represented. We have decided not to pick a particular model to be the definition of a
data type. We have only considered the input-output behavior of the operations of a data
type.

Behavioral equivalence abstracts from (i) multiple representations of a value for
a representational structure as well as from (ii) the representational structure of the values
in an algebra. Thus type algebras differing only in the representational structure of their
values are behaviorally equivalent; furthermore, type algebras using the same
representational structure but differing in the number of representations a value has are
also behaviorally equivalent. The property (i) above is achieved by defining a congruence,
called the observable equivalence relation, on a type algebra, and the property (ii) is
achieved by the standard algebraic concept of isomorphism. The distinguishability
relation, which is the complement of the observable equivalence relation, on the
representations of the values of the data type is defined inductively in terms of the
distinguishability of the representations of the values of the defining types of the data type.
(The basis of this induction is any data type with no defining types, and in particular, the
data type boolean whose two values, true and false, are assumed to be distinguishable.)
Two representations are distinguishable if and only if there is a sequence of operations
having an observer as the outermost operation that produces distinguishable results when
applied separately on the representations.
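A bounded version of this test can be sketched in Python. The sketch is an approximation under stated assumptions rather than the definition developed in this chapter: sets of integers are represented by tuples that may contain duplicates, only the integers in PROBES are used as extra arguments, the search is cut off at a fixed depth, and results of type Int and Bool are compared by ordinary equality.

```python
# Illustrative sketch; the tuple representation, the probe integers and the
# depth bound are assumptions made to keep the search finite.
from itertools import product

def insert(rep, i): return rep + (i,)
def remove(rep, i): return tuple(x for x in rep if x != i)
def has(rep, i): return i in rep
def size(rep): return len(set(rep))

PROBES = (0, 1, 2)   # integers used as extra arguments

def distinguishable(r1, r2, depth=2):
    """True if some sequence of at most `depth` constructor applications,
    followed by an observer, gives different results on r1 and r2."""
    if size(r1) != size(r2):
        return True
    if any(has(r1, i) != has(r2, i) for i in PROBES):
        return True
    if depth == 0:
        return False
    return any(distinguishable(c(r1, i), c(r2, i), depth - 1)
               for c, i in product((insert, remove), PROBES))
```

With these definitions, the tuples (1, 2) and (2, 1, 1) are not distinguished within the bound, since they differ only in how the same set is represented, while (1,) and (1, 2) are separated by Size.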

If the operations of a data type signal exceptions, then two representations can
also be distinguished due to the exceptional behavior of the operations. If a sequence of
operations signals on one representation and does not signal on the other, or if it signals
different exceptions on the two representations, then they are distinguishable.
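As a rough illustration, the outcome of applying an operation can be recorded either as a normal result or as the name of the exception signalled, and two representations compared on that outcome. In the Python sketch below, the exception name "empty", the class Signal, and the deterministic stand-in choose_min are all hypothetical.

```python
# Illustrative sketch; the exception name "empty" and the outcome encoding are assumptions.

class Signal(Exception):
    """An exception signalled by an operation, identified by its name."""
    def __init__(self, name):
        super().__init__(name)
        self.name = name

def choose_min(rep):
    """A deterministic stand-in for Choose that signals on the empty set."""
    if not rep:
        raise Signal("empty")
    return min(rep)

def outcome(op, rep):
    """Either ('normal', result) or ('signal', exception name)."""
    try:
        return ("normal", op(rep))
    except Signal as e:
        return ("signal", e.name)

def distinguished_by(op, r1, r2) -> bool:
    """True if op signals on one representation only, signals different
    exceptions, or returns different normal results."""
    return outcome(op, r1) != outcome(op, r2)
```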

The model used for nondeterminism is simple. If a nondeterministic operation
behaves nondeterministically on an input (i.e., it has a choice to return one of the many
possible results), we expect it to return every possible result. We do not consider how these
results are scheduled by an implementation of the operation. Two operations having
different amounts of nondeterminism are considered to have different observable behavior
because for some input, they can always return distinguishable results. The definition of
the distinguishability relation on the representations of the values of a data type incorporates
this view of nondeterminism.
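One way to picture this view is to model a nondeterministic operation by the set of results it may return on each input. The Python sketch below does so for two variants of Choose; the function names and the encoding of an operation as a function returning a frozenset are assumptions made for the example.

```python
# Illustrative sketch; modelling an operation by its set of possible results
# is an assumption made for this example.

def choose_any(s: frozenset) -> frozenset:
    """Unrestricted Choose: any element of a nonempty set may be returned."""
    return frozenset(s)

def choose_min_or_max(s: frozenset) -> frozenset:
    """Choose with limited nondeterminism: only the minimum or the maximum."""
    return frozenset({min(s), max(s)})

def same_observable_behavior(op1, op2, inputs) -> bool:
    """Two operations agree only if their possible results coincide on every input."""
    return all(op1(s) == op2(s) for s in inputs)

inputs = [frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})]
```

The two variants agree on sets with at most two elements and differ on {1, 2, 3}, where only the unrestricted Choose can return 2.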
In the first section, we introduce terminology, define hierarchically structured
data types, and informally discuss the minimality property of a data type. We assume data
types to be hierarchically structured and defined one at a time. There are however no
technical problems in our formalism in handling mutually recursive data types which are
not defined separately. We outline the simple extensions of the formalism to such data
types in the last section of the chapter. Until the point where we define a data type, we
have used the notion of a data type in an informal way to motivate the formalism
developed.

In the second section, we first introduce the formalism for defining a data type
assuming that its operations do not signal exceptions. Our definitional method is
hierarchical; we assume that the definitions of the defining types are given. We motivate
and discuss in detail the distinguishability relation on the representations of the values. We
then precisely define the behavioral equivalence relation on type algebras.

In the third section, we incorporate the exceptional behavior of a data type and
discuss extensions to the formalism introduced in the second section. We extend a type
algebra and the behavioral equivalence relation on type algebras to capture the normal as
well as the exceptional behavior of the operations. We compare our approach with
Goguen’s approach of modeling the exceptional behavior [20, 21]. We also formalize a
simpler approach for modeling the exceptional behavior which has been generally assumed
in the literature on algebraic specification of data types [25, 27, 77, 23]. We compare our
definition of a data type with the definition used by the ADJ group [23] which abstracts
only from the representation structure of the values in a type algebra.




2.1 Informal Description of a Data Type


We use the data type finite set of integers for illustration; let Set-Int be its name.
Set-Int has been widely discussed in the literature [37, 76, 74, 31]. It has the following
operations:


Null    a constant (or 0-ary operation) returning the empty set of integers;
Insert  constructs a finite set of integers by adding a given integer to a given finite set of
        integers;
Remove  constructs a finite set of integers by deleting a given integer from a given finite set of
        integers;
Has     checks whether a given integer is an element of a given finite set of integers;
Size    results in an integer giving the size of a given finite set of integers.

In addition, we assume that Set-Int has an additional operation Choose, which has
nondeterministic behavior. Choose returns an arbitrary element of a given nonempty set
of integers; for the time being, we arbitrarily assume that Choose returns the integer ‘0’ for
the empty set. This behavior of Choose for the empty set may not be adequate for some
applications. In Section 2.3, we modify Choose so that it signals an exception for the empty
set.
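The informal description of Set-Int can be made concrete with a minimal Python sketch. It is one possible model under stated assumptions, not a definition from the thesis: values are immutable frozensets, operation names are written in lower case, and Choose is modeled by the set of integers it may return.

```python
# Illustrative sketch; frozenset as the representation and the lower-case
# function names are assumptions, not the thesis's definitions.

def null() -> frozenset:
    return frozenset()

def insert(s: frozenset, i: int) -> frozenset:
    return s | {i}

def remove(s: frozenset, i: int) -> frozenset:
    return s - {i}

def has(s: frozenset, i: int) -> bool:
    return i in s

def size(s: frozenset) -> int:
    return len(s)

def choose(s: frozenset) -> frozenset:
    """Set of integers that Choose may return on s: any element of a nonempty
    set, and (by the temporary convention above) 0 for the empty set."""
    return frozenset(s) if s else frozenset({0})

s = insert(insert(null(), 4), 7)
```

Because every operation returns a new value and never modifies its argument, the sketch matches the immutable view of data types taken in this work.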


2.1.1 Terminology 


To simplify the mathematics, we assume that an operation has a cartesian product 
(possibly empty) of data types as its domain and a single data type as its range. An 
operation having a cartesian product of n data types (n > 1) as its range can be viewed in 
one of the following two ways depending on whichever is more convenient: (i) The 
operation is modeled as a family of n operations, each having the same domain as the 
original operation and a different type in the cartesian product as the range, or (ii) the 
cartesian product is viewed as a single type. We use the first method in the thesis. 

Let D be the name of a new data type being defined, and Ω be the finite set of
symbols naming its operations. Let Δ' stand for the set of names of data types appearing
either as a component of the domain or as the range of an operation in Ω. Let Δ be
Δ' - {D}.¹ D is the defined type and every data type in Δ is a defining type of D.

In order to include the syntactic specification (i.e., the domain and range
specifications) of the operations, we index every operation σ in Ω by a pair (d, r), where d is
a string made from the alphabet Δ' and r is an element of Δ'. d specifies the domain of σ
and r specifies its range.

Let Int stand for the data type integers and Bool stand for the data type boolean.
For Set-Int, Δ = { Int, Bool }, Δ' = { Int, Bool, Set-Int } and
Ω = { Null, Insert, Remove, Has, Size, Choose }. The index of Insert, for example, is
(Set-Int · Int, Set-Int).

As is discussed in the first chapter, the operations of D can be classified as
constructors and observers. Let Ω_c be the subset of Ω consisting of all constructors of D
(recall that a constructor is an operation having D as its range). For example, Null, Insert,
and Remove are the constructors of Set-Int. The constructors construct all the values of D.
Some constructors construct a value of D using only the values of the defining types of D.
We call such a constructor a basic constructor. For example, Null is a basic constructor of
Set-Int. Every data type is required to have at least one basic constructor; otherwise, D will
not have any values.

Let Ω_o be the subset of Ω consisting of all observers of D. An observer examines
the values of D; it takes at least one argument of type D, and returns a value of a defining
type of D. For example, Has, Size, and Choose are the observers of Set-Int. Every
interesting data type must have at least one observer; otherwise there is no way to
distinguish among different values of D [25] other than by the operations signalling on the
values. An observer is also called an inquiry operation [77].

We thus assume that every operation of D either results in a value of D, or takes
an argument of type D, or both. We consider a data type having an operation not satisfying
this requirement to be not properly designed, because the behavior of such an operation
does not depend on the data type.
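The terminology of this subsection can be summarized in a short Python sketch for Set-Int. The encoding is an assumption made for illustration: each operation is mapped to its index (d, r), with the domain string d written as a tuple of type names, and the argument order of the operations is assumed.

```python
# Illustrative sketch; the encoding of each index (d, r) as a tuple and the
# argument order of the operations are assumptions.

D = "Set-Int"
OMEGA = {
    "Null":   ((), "Set-Int"),
    "Insert": (("Set-Int", "Int"), "Set-Int"),
    "Remove": (("Set-Int", "Int"), "Set-Int"),
    "Has":    (("Set-Int", "Int"), "Bool"),
    "Size":   (("Set-Int",), "Int"),
    "Choose": (("Set-Int",), "Int"),
}

# Delta' is every type name used in a domain or range; Delta drops D itself.
delta_prime = {t for d, r in OMEGA.values() for t in (*d, r)}
delta = delta_prime - {D}

# Constructors have D as their range; basic constructors take no argument of type D.
constructors = {op for op, (d, r) in OMEGA.items() if r == D}
basic_constructors = {op for op in constructors if D not in OMEGA[op][0]}

# Observers take an argument of type D and return a value of a defining type.
observers = {op for op, (d, r) in OMEGA.items() if D in d and r in delta}

# Every operation should return a value of D, take an argument of type D, or both.
properly_designed = all(r == D or D in d for d, r in OMEGA.values())
```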


1. Henceforth we will not distinguish between a data type and its name, and between an operation and its
name, unless needed.




Let Ω_nd stand for the set of nondeterministic operations of D. We allow any kind
of operation, an observer or a constructor, to be nondeterministic. In our experience,
however, we have found that a nondeterministic operation is often an observer.²


2.1.2 Hierarchical Structure 


We define the following two relations on a set of data types for capturing the
dependency structure among the data types:


Def. 2.1 D directly depends on every D' ∈ Δ, and does not directly depend on any other data
type. ∎


Def. 2.2 D depends on D' if (i) D directly depends on D', or (ii) there is a D'' such that D
directly depends on D'' and D'' depends on D'. ∎


The direct dependency relation captures one level of hierarchical dependency. The
dependency relation is the transitive closure of the direct dependency relation. We define

(D)+ = { D' | D depends on D' }, and

(D)* = (D)+ ∪ { D }.
If data types are designed so that every data type on which D depends is assumed to be
designed independently of D, then the dependency relation on (D)+ will not have any
cycles and is a strict partial order on data types. In such a case, data types are said to be
hierarchically structured, and they can be defined incrementally one at a time. Data types
on which D depends do not have to be designed in any particular order relative to D; any
approach, for example top-down, bottom-up, etc., is compatible. Unless stated otherwise,
we assume in the thesis that data types are hierarchically structured.
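The two relations and the hierarchy condition can be sketched in Python as follows. The example hierarchy is an assumption made for illustration: in particular, the type "Queue-of-Set-Int" is hypothetical, and Int is assumed to have Bool as its only defining type.

```python
# Illustrative sketch; the example hierarchy (including the hypothetical type
# "Queue-of-Set-Int") and the function names are assumptions.

DIRECT = {                      # D -> its defining types (Def. 2.1)
    "Bool": set(),
    "Int": {"Bool"},
    "Set-Int": {"Int", "Bool"},
    "Queue-of-Set-Int": {"Set-Int", "Bool"},
}

def depends_on(d, direct=DIRECT):
    """(D)+ : transitive closure of the direct dependency relation (Def. 2.2)."""
    seen, stack = set(), list(direct[d])
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(direct[t])
    return seen

def is_hierarchical(direct=DIRECT):
    """Hierarchically structured: no data type depends on itself."""
    return all(d not in depends_on(d, direct) for d in direct)
```

A pair of mutually dependent types, such as {"A": {"B"}, "B": {"A"}}, fails the is_hierarchical check because each type ends up in its own dependency set.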

We assume that the partial order induced by the dependency relation on the set of
hierarchically structured data types has finite descending chains. The bottom of every
chain is a data type having no defining type. Throughout this thesis, we assume that the
data type boolean does not have any defining type; Bool serves as the bottom element of
the chains in the partial ordering for all interesting data types, as will be clear from the
discussion in Section 2.2. (The definition of Bool is given in Section 2.2.) We will often use
the structure induced by the dependency relation on the set of data types for inductively
defining properties of data types, as well as for proving properties about data types. Bool
will often serve as the basis step of such definitions and proofs (in general, data types
having no defining type serve as the basis).

2. In case a constructor σ is nondeterministic, σ is usually derived with respect to a subset Ω_1 of deterministic
constructors (Ω_1 ⊆ Ω_c) in the sense that σ does not return any value that cannot be constructed using the
constructors in Ω_1.


2.1.3 Minimality Property 


The requirement on a data type behavior imposed because of the modularity and 
good program design considerations that its values be manipulated only by its operations 
translates to requiring that its values be constructed only by its constructors, possibly using 
abstractly the values of its defining types. Furthermore in a computer the values can be 
constructed only by a finite sequence of operations, so the values of a data type constitute 
the smallest set closed under finitely many applications of its constructors. We call this 
property of a data type the minimality property. 

We require that every data type under consideration satisfy the minimality 
property. This requirement constrains the implementations of a data type to be protected in 
Morris's sense [59]. An implementation of a data type defined in a strongly typed language 
that hides the representation of its (data type) values from its users by providing an 
encapsulation mechanism, as in CLU, ALPHARD, etc., is protected. The minimality 
requirement does not rule out data types defining ‘infinite’ values, insofar as these values 
can be finitely described.³


3. For example, we can define a data type infinite sequence of squares, whose values are infinite sequences of
consecutive squares starting from n², for every n > 0. It has a constructor, Cons, which takes a natural
number as an argument and returns an infinite sequence. In addition, it has three observers: First, which
gives the first element in the sequence; Rest, which gives the remaining sequence after stripping the first
element from the sequence; and Equal, which checks whether two infinite sequences are equal or not.




The minimality property serves as the basis of a powerful induction rule for a data
type D: To prove that a property P holds for D, i.e., for all values of D, we need to show
that P is preserved by every constructor of D. Wegbreit and Spitzen [72] called this
generator induction; Guttag et al. [27] called it data type induction. We discuss this
induction rule in detail in Chapter 4 on the deductive system for data types.

Since every operation of D is assumed to be computable, it can be easily shown by
induction on data types that the set of values of D is recursively enumerable. This is
based on the fact that the set of sequences of constructors is recursive. This thesis considers
data types with a recursively enumerable set of values and a finite set of total computable
operations.
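
To make the enumerability argument concrete, here is a minimal sketch that lists the values of a toy type by walking through constructor sequences layer by layer. The "stack of bits" type, with a basic constructor Nil and a constructor Push, is an assumption made for this example only.

```python
from itertools import islice

def enumerate_values():
    """List every value of a toy 'stack of bits' type, one constructor layer at a time."""
    layer = [()]                      # the result of the basic constructor Nil
    while True:
        yield from layer
        # apply Push(s, b) to every value of the current layer, for both bits
        layer = [s + (b,) for s in layer for b in (0, 1)]

# The first seven values, in order of construction.
first_seven = list(islice(enumerate_values(), 7))
```

Every value appears after finitely many steps, which is the sense in which the generator lists the value set.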


4. A set S is recursive iff its characteristic function, which checks whether a given element is a member of S
or not, is total computable. A set S is recursively enumerable iff it is the range of a total computable
function. In other words, the elements of the set S can be listed by a total computable function.




2.2 Formalism 


In this section, we describe the formalism to state precisely what a data type is. To 
simplify the presentation, we assume that data types do not have any exceptional behavior, 
i.e., their operations do not signal any exceptions. Every operation terminates normally on
every input in its domain.

This section is organized as follows. We first extend the notion of a
heterogeneous algebra as defined in [4] to model nondeterminism; then we define a type
algebra to be an extended heterogeneous algebra with additional properties. The domain
corresponding to the defined type D consists of the representations of the values of D and
is called the principal domain of the type algebra. To extract the behavior of a type algebra
as observed through its operations, we must

(i) abstract from the multiple representations of a value, assuming a particular
representational structure, and
(ii) abstract from the representational structure of the values used in a type
algebra.
To do the first, we define an interpretation of a term in a type algebra, where a term
expresses a sequence of operations. Terms are used to observe the behavior of the
representations of the values of the defined type in a type algebra in terms of the
representations of the values of the defining types. We define the observable equivalence
and distinguishability relations on the principal domain of a type algebra. These relations
are defined inductively using the corresponding relations on the domains corresponding to
the defining types in the type algebra. Observable equivalence is an equivalence relation
and is preserved by the functions in a type algebra; it relates two values having the same
behavior. We then define the behavioral equivalence relation on type algebras, which relates
two type algebras having the same observable behavior. A data type is an equivalence class
defined by the behavioral equivalence relation, and every type algebra in the equivalence
class is a model of the data type. A model of a data type concretely defines the value set,
which is the principal domain of the model, and the operations of the data type.
Most of the definitions throughout this section are inductive; they make use of




the dependency relation, which is a strict partial order with finite descending chains, on
hierarchically structured data types. An inductive definition of a concept has three parts:

(i) basis part, which deals with the case of a data type D having no defining type, i.e., its Δ
is the null set,

(ii) inductive part, which deals with the case of a data type having defining types, and

(iii) closure part, which states that the above two ways are the only ways of defining a
concept.


To avoid repetition, we omit the closure part, and if the basis part can be derived from the
inductive part by assuming Δ to be the null set, we give only the inductive part of the
definition. Some of the definitions, namely the definitions of type algebra (Def. 2.3),
distinguishability and observable equivalence relations (Defs. 2.6 and 2.7), and data type
(Def. 2.14), are mutually recursive. The definitions 2.3, 2.6, and 2.7 assume the definitions
of the defining types in Δ in their inductive part.

We would like to motivate various concepts and definitions introduced on type
algebras. So for exposition purposes, we may refer to a type algebra as though it is a model
of a data type being discussed.⁵


2.2.1 Type Algebras


A heterogeneous algebra as defined by Birkhoff and Lipson [4] is a finite indexed
set of sets (called domains in the thesis) and a finite indexed set of total functions. We
extend this definition to model the nondeterministic operations of a data type. An
extended heterogeneous algebra can have either a total (deterministic) function or a total
nondeterministic function.

A nondeterministic function f: X → Y is similar to a function in mathematics with
the exception that it has a choice among a subset of possible results when applied on an
input x ∈ X. Let f(x) stand for an arbitrary result of applying f on x. f can be characterized


5. We are technically justified to do so as almost every type algebra is a model of some data type.




using a relation R ⊆ X × Y such that f(x) ∈ R(x).⁶ If R(x) is a singleton set for some input
x, then f is said to be deterministic on x. By { f(x) } we will mean the set R(x); in this way
we do not have to refer to R. Since we assume every nondeterministic operation to have
finitely many choices on a particular input, { f(x) } is always a finite set. We admit that
calling f a nondeterministic function is an abuse of the term function; however, we feel this
term conveys the behavior of f well. Henceforth, by the term function we mean either a
mathematical (deterministic) function or a nondeterministic function, unless qualified. We
have chosen a nondeterministic function over the corresponding deterministic relation for
modeling a nondeterministic operation because (i) in contrast to the nondeterministic
function, the relation models the nondeterministic operation indirectly, and (ii) it is
inconvenient and unnatural to express the behavior of a computation scheme involving
nondeterministic operations by means of the relations corresponding to the
nondeterministic operations.
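
The relational characterization of a nondeterministic function can be illustrated with a small sketch. The relation `R` below and its string values are hypothetical; only the notions R(x), R(A) and "deterministic on x" come from the text.

```python
def image(relation, x):
    """R(x): every y such that the pair (x, y) is in the relation."""
    return {y for (a, y) in relation if a == x}

def image_of_set(relation, xs):
    """R(A): every y related to some x in the set A."""
    return {y for (a, y) in relation if a in xs}

def is_deterministic_on(relation, x):
    """f is deterministic on x when R(x) is a singleton set."""
    return len(image(relation, x)) == 1

# Hypothetical relation for a choice operation on two inputs.
R = {("ab", "a"), ("ab", "b"), ("c", "c")}
```

Here `image(R, "ab")` plays the role of { f(x) }: the finite set of results the function may return on that input.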

The definitions of concepts such as congruence, homomorphism, and isomorphism on
heterogeneous algebras [4] are revised for extended heterogeneous algebras in Appendix II.
Henceforth, we use the term heterogeneous algebra to mean an extended heterogeneous
algebra.

A type algebra is a heterogeneous algebra with additional properties. For a data
type D, we are interested in type algebras having a particular structure, which is determined
by Δ' and Ω of D. The sets Δ' and Ω serve as the index sets of the type algebras of interest
for D. We call such an algebra an algebra of type D, or simply a type algebra when D is
evident from the context. The triple (D, Δ, Ω) is called the (similarity) type of such an
algebra. An algebra A of type D consists of a domain corresponding to every type name
D' ∈ Δ' and a function of the appropriate arity corresponding to every operation name in Ω.
The domain corresponding to D is the principal domain of A. The function corresponding
to σ is called the interpretation of the operation symbol σ in A. The domain corresponding
to a defining type D' ∈ Δ is the interpretation of D'.


6. For a relation R, a subset of X × Y, R(x) stands for the subset { y | <x, y> ∈ R } of Y for an x ∈ X, and
R(A) stands for { y | <x, y> ∈ R, x ∈ A }, where A ⊆ X.




We assume that every defining type D' in Δ of D is defined elsewhere, and we are
given the models of D' (see Subsection 2.2.6 for the definition of a data type and a model of
a data type). The interpretation of a data type D' ∈ Δ in an algebra of type D is fixed. We
use the models of each D' ∈ Δ to define type algebras of D. The domain corresponding to
D' ∈ Δ in a type algebra A of D is the value set of D' defined by some model A' of D'. A
type algebra A of D explicitly includes only the interpretations of the operation names of
D, and does not include the interpretations of the operation names of any defining type D'.
We assume that every operation name of a defining type D' has the same interpretation in
A of D as its interpretation in the model A' of D'. In this way, we define the interpretation
of every operation name of a data type D'' ∈ (D)* in a type algebra A of D.⁷ An algebra A
of type D is thus really a huge structure having interpretations for every data type in (D)*.


Def. 2.3 An algebra A of type D is a heterogeneous algebra
    [ { V_D' | D' ∈ Δ' } ; { f_σ | σ ∈ Ω } ]
such that

(i) for every defining type D' ∈ Δ, V_D' is the value set of D' defined by a
model A' of D',

(ii) for every σ ∈ Ω, f_σ is a total function of the appropriate arity, i.e., if σ has
D_1 × ... × D_n as its domain and D' as its range,⁸ then f_σ has
V_D_1 × ... × V_D_n as its domain and V_D' as its range, and

(iii) V_D is the smallest set closed under finitely many applications of the
functions corresponding to the constructors of D, i.e.,

    V_D = ∪_{i ≥ 0} V_D^i, where V_D^0 = ∅ and

    V_D^i = { f_σ(v_1, ..., v_n) | for each σ ∈ Ω_c such that
              σ : D_1 × ... × D_n → D, v_j ∈ ∪_{k < i} V_D^k if D_j = D and
              v_j ∈ V_D_j if D_j ≠ D }, for i > 0.


7. Recall that (D)* is the set consisting of D and all data types on which D depends.
8. I.e., (D_1 ... D_n, D') is the index of σ.




So, V_D is the principal domain of A, and f_σ is the interpretation in A of the operation name
σ ∈ Ω. We do not require the interpretation f_σ of σ to be a deterministic function if σ is
deterministic and f_σ to be a nondeterministic function when σ is nondeterministic; the
reason for this will become clear in Subsections 2.2.5 and 2.2.6 on the behavioral
equivalence of type algebras and the definition of a data type respectively.

If any f_σ in A is a nondeterministic function, then A is called a nondeterministic
type algebra; otherwise, if every f_σ is deterministic, then A is called a deterministic type
algebra. Henceforth, in the context of an algebra A of type D, by an operation σ we mean
its interpretation f_σ, and by a value of D we mean an element of V_D.

The property (iii) above is due to the requirement that D satisfies the minimality
property. For a constructor σ, if f_σ is nondeterministic, then V_D is closed under f_σ
assuming f_σ could return any possible result for an input. Once the value set corresponding
to each defining type D' is fixed, then obviously V_D is uniquely determined by
{ f_σ | σ ∈ Ω_c }, and is nonempty, because Ω_c is nonempty and has at least one basic
constructor (see Section 2.1).
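
The closure condition of property (iii) can be sketched as a fixed-point computation. The example below assumes a set-of-integers type with constructors Null, Insert and Remove, and it replaces the value set of Int by the finite stand-in `INTS` so that the iteration terminates; both choices are assumptions made for illustration.

```python
INTS = (0, 1)   # assumed finite stand-in for the value set of Int

def null():
    return frozenset()

def insert(s, i):
    return s | {i}

def remove(s, i):
    return s - {i}

def stages():
    """Cumulative stages of the value set, until no constructor adds a new value."""
    built = set()          # every value constructed so far; starts empty
    history = []
    while True:
        new = {null()}
        new |= {insert(s, i) for s in built for i in INTS}
        new |= {remove(s, i) for s in built for i in INTS}
        if new <= built:   # closed under all constructors: fixed point reached
            return history
        built |= new
        history.append(frozenset(built))
```

With two integers the iteration stabilizes after three stages, at the four subsets of {0, 1}.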


2.2.2 Examples of Type Algebras


We discuss below a type algebra A_SI of Set-Int. A_SI is a natural model of Set-Int: the
interpretations of its operations are defined in terms of the standard set operations.

A_SI = [ { S, Z, B } ; { Nu, In, Re, Ha, Si, Ch } ],
where B = { true, false }, a value set of Bool,
      Z = { 0, 1, -1, 2, -2, ... }, a value set of Int, and
      S = { ∅, {0}, {1}, {-1}, {2}, {-2}, {0, 1}, {0, -1}, {0, 2},
            {0, -2}, {1, -1}, {1, 2}, ... }, the domain corresponding to Set-Int.



The domains Z and B are defined elsewhere by the models of Int and Bool, respectively.
The first two letters of an operation name are used to denote, in A_SI, the total
function corresponding to the operation. These functions are defined below. We will use
any convenient mathematical formalism to give the definitions of the functions. We use




the symbol '≜' as the definition symbol; the symbol ';' marks the beginning of a
comment in a definition, running until the end of the line.


Nu        ≜  ∅
In(s, i)  ≜  s ∪ {i}
Re(s, i)  ≜  s - {i}        ; - is the difference operator
Ha(s, i)  ≜  i ∈ s
Si(s)     ≜  |s|            ; the cardinality of the set
Ch(s)     ≜  0                      if s = ∅
             i such that i ∈ s,     otherwise.


Ch is a nondeterministic total function; if s is not ∅, then { Ch(s) } = s.
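
As a rough illustration, the functions of this set-based algebra can be written down directly with Python frozensets. Two points are assumptions of the sketch: `Ch` returns the whole set of possible results rather than one arbitrary choice, and it yields 0 on the empty set.

```python
def Nu():
    return frozenset()

def In(s, i):
    return frozenset(s | {i})

def Re(s, i):
    return frozenset(s - {i})       # set difference

def Ha(s, i):
    return i in s

def Si(s):
    return len(s)                   # cardinality of the set

def Ch(s):
    """The set of possible results of choosing from s (assumed {0} when s is empty)."""
    return {0} if not s else set(s)
```

For a nonempty `s`, `Ch(s)` equals `s` itself, which mirrors the statement that any member may be chosen.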
We discuss another type algebra A'_SI of Set-Int in which the set values are
represented as finite sequences of nonrepeating integers:
A'_SI = [ { SQ', Z, B } ; { Nu', In', Re', Ha', Si', Ch' } ],

where SQ' = { <>, <0>, <1>, <-1>, <2>, <-2>, <0, 1>, <0, -1>, <0, 2>, <0, -2>,
              <1, 0>, <1, -1>, <1, 2>, <1, -2>, <-1, 0>, <-1, 1>, <-1, 2>, <-1, -2>,
              <2, 0>, <2, 1>, ... }, the domain corresponding to Set-Int.
The set SQ' contains all finite sequences of integers not having multiple occurrences of the
same integer; for example, <0, 0> and <0, 1, -1, 0> are not in SQ'. Let s stand for an element of
SQ'. So, s = <i_1, ..., i_m>, m ≥ 0; if m = 0, then s = <>.

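Nu'                       ≜  <>

In'(<i_1, ..., i_m>, i)   ≜  <i_1, ..., i_m>                            if i = i_j for some 1 ≤ j ≤ m
                             <i_1, ..., i_m, i>                         otherwise

Re'(<i_1, ..., i_m>, i)   ≜  <i_1, ..., i_{j-1}, i_{j+1}, ..., i_m>     if i = i_j for some 1 ≤ j ≤ m
                             <i_1, ..., i_m>                            otherwise

Ha'(s, i)                 ≜  true                                       if i = i_j for some 1 ≤ j ≤ m
                             false                                      otherwise

Si'(<i_1, ..., i_m>)      ≜  m

Ch'(<i_1, ..., i_m>)      ≜  0                                          if m = 0
                             i_j, 1 ≤ j ≤ m,                            if m > 0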




Ch' is a nondeterministic function; { Ch'(<i_1, ..., i_m>) } = { i_1, ..., i_m } for m > 0.
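
The sequence-based algebra can be sketched in the same way with Python tuples. The function names are hypothetical, and two details are assumptions of the sketch rather than facts recovered from the text: a new integer is appended at the end of the sequence, and choosing from the empty sequence yields 0.

```python
def nu_seq():
    return ()

def in_seq(s, i):
    return s if i in s else s + (i,)          # appending at the end is an assumption

def re_seq(s, i):
    return tuple(j for j in s if j != i)      # drop i if present, keep the order

def ha_seq(s, i):
    return i in s

def si_seq(s):
    return len(s)

def ch_seq(s):
    """The set of possible results of choosing from s (assumed {0} when s is empty)."""
    return {0} if not s else set(s)
```

Unlike the set-based version, `in_seq(in_seq(nu_seq(), 1), 2)` and `in_seq(in_seq(nu_seq(), 2), 1)` are different tuples that represent the same set value.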


2.2.3 Interpretation of Terms


A term is constructed using the operation names of types in (D)* and the typed
variables. It expresses a sequence of operations, so it forms a straight line program. The
interpretation of a term in a type algebra is like the execution of such a program. The
interpretation of all terms characterizes the behavior of the algebra.

We assume that we have as many variables (countably infinite) of every type
D' ∈ (D)* as needed.


Def. 2.4 A term of type D' ∈ (D)* is defined inductively as follows:

(i) A variable x of type D' is a term of type D',

(ii) if σ is an operation of some type D'' ∈ (D)* such that its domain is
D_1 × ... × D_n and its range is D', then 'σ(e_1, ..., e_n)' is a term of type
D' if and only if each e_i is a term of type D_i ∈ (D)*.


If a term has no variables, it is called a ground term. A term of type Bool is called a boolean
term. When we wish to refer to the variables of e, we write e as e(x_1, ..., x_n) (or e(X)),
where the set { x_1, ..., x_n } (or X) consists of all variables in e. A subterm of a term that is
a variable is the term itself. The subterms of a term of the form 'σ(e_1, ..., e_n)' are (i) the
term 'σ(e_1, ..., e_n)' itself, (ii) all subterms of e_1, ..., e_n, and nothing else.

An interpretation of a ground term e in a type algebra A of type D is obtained by
performing the sequence of operations expressed by e. A ground term e of type D' is
interpreted in A as follows: If e is a 0-ary operation name σ, an interpretation of e is the
result of applying the interpretation of σ in A. If e is 'σ(e_1, ..., e_n)', an interpretation of e
is the result of applying the interpretation of σ in A on the interpretations of e_1, ..., e_n in
A. An interpretation of e is an element of V_D'. Since e may be constructed using
nondeterministic operation names, e can have many interpretations. Let e|_A stand for an
arbitrary interpretation of e in A.
For example, let us assume that the defining type Int of Set-Int has the




constructors 0, 1, 2, and 3, and that they have the standard interpretation in a model of Int.
Then e_1 = Insert(Insert(Null, 0), 1) and e_2 = Choose(e_1) are ground terms of types Set-Int
and Int respectively. We have,
    e_1|_A_SI = {0, 1} and
    e_2|_A_SI = 0 or 1.
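
The interpretation of ground terms in the presence of nondeterminism can be sketched as a function that returns the set of all possible interpretations. The encoding of terms as nested tuples and the table `OPS` are assumptions of this sketch; each entry returns the set of results its operation may produce.

```python
from itertools import product

# Each operation name maps to a function returning the SET of possible results.
OPS = {
    "Null":   lambda: {frozenset()},
    "Insert": lambda s, i: {s | {i}},
    "Choose": lambda s: set(s) if s else {0},   # 0 on the empty set is an assumption
    "0":      lambda: {0},
    "1":      lambda: {1},
}

def interpretations(term):
    """All interpretations of a ground term written as (op_name, subterm, ...)."""
    name, *args = term
    arg_choices = [interpretations(a) for a in args]
    results = set()
    # try every combination of choices made while interpreting the subterms
    for values in product(*arg_choices):
        results |= OPS[name](*values)
    return results

e1 = ("Insert", ("Insert", ("Null",), ("0",)), ("1",))
e2 = ("Choose", e1)
```

Under these assumptions `interpretations(e1)` contains the single set {0, 1}, while `interpretations(e2)` contains both 0 and 1.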


As every operation name of a data type D' ∈ (D)* has a total function as its
interpretation in an algebra A of type D, we have


Prop. 2.1 Every ground term of type D' ∈ (D)* has an interpretation in A.


Furthermore, since every data type under consideration has the minimality property, we
have


Prop. 2.2 Every value in V_D' is an interpretation of some ground term of type D'.
Proof Straightforward, by induction on type algebras using the dependency relation.


For a term e of type D' having variables, its interpretation is a function, which is
denoted by f_e. If e has nondeterministic operation names, then f_e is in general a
nondeterministic function. Let { x_1, ..., x_n } be the only variables in e, and D_i be the type
of x_i. Then f_e has V_D_1 × ... × V_D_n as its domain and V_D' as its range. If the variables
x_1, ..., x_n in e are instantiated in A to be the values v_1, ..., v_n respectively, from the
appropriate domains in A, then e(x_1, ..., x_n) is said to be instantiated in A as
e[x_1/v_1, ..., x_n/v_n], and can be interpreted in A. The assignment [x_1/v_1, ..., x_n/v_n] is
called an A-instance of x_1, ..., x_n, and each v_i is called an instance of x_i. (We will
abbreviate the assignment as [X/V], where V stands for { v_1, ..., v_n }.) An interpretation of
e[X/V] in A, written as e[X/V]|_A, is defined as follows:
(i) If e is a variable x_i, then e[X/V]|_A = v_i, and
(ii) if e is of the form 'σ(e_1, ..., e_m)', m ≥ 0,
then e[X/V]|_A = f_σ(e_1[X/V]|_A, ..., e_m[X/V]|_A).
f_e(V) is e[X/V]|_A.
Interpreting a ground term or an instantiated term in A is thus like performing a




computation: an interpretation is the result of the computation. 
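
The two-case definition for instantiated terms translates almost directly into a recursive function. In this sketch, variables are encoded as strings, operations as nested tuples, and the assignment [X/V] as a dictionary; these encodings and the table `OPS` are assumptions made for the example.

```python
from itertools import product

# Each operation returns the set of results it may produce.
OPS = {
    "Insert": lambda s, i: {s | {i}},
    "Size":   lambda s: {len(s)},
    "Choose": lambda s: set(s) if s else {0},   # 0 on the empty set is an assumption
}

def interpret(term, env):
    """Set of interpretations of term under the assignment env (variable -> value)."""
    if isinstance(term, str):          # case (i): a variable takes its assigned value
        return {env[term]}
    name, *args = term                 # case (ii): an operation applied to subterms
    results = set()
    for values in product(*(interpret(a, env) for a in args)):
        results |= OPS[name](*values)
    return results

term = ("Choose", ("Insert", "x", "i"))
env = {"x": frozenset({5}), "i": 7}
```

Here `interpret(term, env)` collects every value the instantiated term may evaluate to, one for each choice made by `Choose`.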
2.2.4 Observable Behavior 


The behavior of a sequence of operations of a data type D, strictly speaking,
becomes externally observable if the sequence has an effect on the outside world, for
example, the sequence of operations ultimately results in some output on an I/O device,
such as a line printer, CRT, etc. In this sense, the distinction between two values of D is
observable if and only if there exists a sequence of operations such that when applied on
the values separately, it returns distinguishable outputs on an I/O device. An output on an
I/O device can be considered as a sequence of characters, and we can have a predicate on
the outputs, resulting in the boolean constants T and F depending upon whether the two
given outputs are distinguishable or not. In this way, we can define the distinguishability of
the values of D using the distinguishability of the boolean constants. We stop at Bool. As
was stated earlier, we use the definition of Bool as the basis of our formalism. In fact, any
data type (or a collection of data types) whose values can be distinguished a priori (outside
the formalism) can be used as the basis. For instance, a data type directly supported in a
programming language whose values are distinguishable by the literal mechanism in the
programming language can be used.

We structure the above informal definition of distinguishability using the
dependency relation on data types. Instead of defining the distinguishability of the values
of D in terms of the distinguishability of boolean values in a single step, we do it
incrementally. We assume that the distinguishability relation is defined on the values of
every defining type D' ∈ Δ, if any; in this way, the behavior of the values of D can be
incrementally observed through its observers. Except for Bool, if D does not have any
observers, i.e., its Ω_o is the empty set, then the values of D are not distinguishable, as there
is no way to tell whether any two values are different. That is why we remarked earlier that
every interesting data type must have at least one observer.

For a D with a nonempty set of observers, it is generally not sufficient to examine
the values of D directly by the observers due to the possible delayed effects of the
constructors. The distinguishability of the values may not manifest itself until some




constructors are applied on them. For example, two different nonempty stacks of the data
type stack of integers may have the same integer as their top element, so they cannot be
distinguished directly by the observer Top. But if we apply the Pop operation first on the
two stacks, then the resulting stacks may be directly distinguishable by the observer Top,
thus exhibiting that the original stacks are also distinguishable. There is generally a need to
perform a sequence of operations, with an observer of D as the last operation in the
sequence, to distinguish two values of D.
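
The stack example can be made concrete with a short sketch. Stacks are modeled here as Python tuples whose last element is the top; this representation and the two sample stacks are assumptions of the example.

```python
def top(stack):
    return stack[-1]        # observer: the top element of a nonempty stack

def pop(stack):
    return stack[:-1]       # constructor: the stack without its top element

# Two different stacks that share the same top element.
s1 = (1, 9)
s2 = (2, 9)

same_top = top(s1) == top(s2)                      # Top alone cannot tell them apart
differ_after_pop = top(pop(s1)) != top(pop(s2))    # Pop followed by Top can
```

The distinguishing sequence is Pop followed by Top, with the observer applied last.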
Informally, two values of D are distinguishable if and only if either

(i) there is a sequence of deterministic operations of D such that when it is applied on
the two values, assuming every other argument of the sequence fixed, it results in
distinguishable values of some defining type D' ∈ Δ, or

(ii) there is a sequence including nondeterministic operations such that the result of
applying it on a value for some choice made by the nondeterministic operations is
distinguishable from the result of applying it on the other value for every choice
made by the nondeterministic operations, or vice versa.


If two values are not distinguishable, they are called observably equivalent. For better
exposition, we have deliberately structured the definition of distinguishability into two
cases, though the second case can be modified to cover the first case. The second case may
appear to be a very strong requirement, but a small amount of thinking should convince
the reader that such is not the case, as we definitely do not want a value to be
distinguishable from itself. Furthermore, observable equivalence should be an equivalence
relation. We formalize below these requirements in the context of a type algebra and
illustrate them using examples.

The operations of a data type must also preserve the observable equivalence
relation on the values of every defining type in the sense that the operations cannot
distinguish among the observably equivalent values of a defining type. This requirement
on the operation behavior is necessary because of the modular structure of data types. A
new data type should not impose any additional structure on the values of any of its
defining data types. This property of a data type is guaranteed in all programming




languages supporting an abstract data type mechanism in which an implementation of a
data type is hierarchically structured and the representation is hidden from the users of a
data type.
We would like the type algebras to have the above properties. Definition 2.3 of a
type algebra does not guarantee them, so we put an additional constraint on a type algebra.
We first define the observable equivalence relation E_D on the principal domain V_D of a
type algebra A; we will assume that the observable equivalence relation E_D' on V_D' in A is
defined for each D' ∈ Δ by a model A' of D' having V_D' as its principal domain. We show
that E_D as defined below is an equivalence relation. Later we define a well formed type
algebra whose functions preserve the set E = { E_D' | D' ∈ Δ' } of observable equivalence
relations. Only the well formed type algebras are of interest for defining a data type.

In the above discussion, we have only considered the input-output behavior of the
operations for distinguishing different values. We have not considered the efficiency of the
operations. In case of nondeterministic operations, we have not considered how possible
values that a nondeterministic operation can return on a particular input are scheduled.
Our formalism is limited in this sense.


2.2.4.1 Definitions of Observable Equivalence and Distinguishability


We give the basis and the inductive parts of the inductive definition of the
distinguishability relation. The basis part is the case when D does not have any defining
type and the inductive part is the case when D has defining types. In the basis part, there
are two subcases: (i) D is Bool, and (ii) D is different from Bool. We first define the data
type Bool and then define the distinguishability relation on the models of Bool.

The data type Bool does not have any defining types and is self-contained. We
present below a model of Bool and call it B.

B = [ { { true, false } } ; { T, F, ~, ∨, ∧, ⇒, ≡ } ], where

T              ≜  true
F              ≜  false
~true          ≜  false
~false         ≜  true
true ∨ true    ≜  true
true ∨ false   ≜  true
false ∨ true   ≜  true
false ∨ false  ≜  false
x ∧ y          ≜  ~(~x ∨ ~y)
x ⇒ y          ≜  (~x) ∨ y
x ≡ y          ≜  ~(x ∨ y) ∨ (x ∧ y)


The interpretation of T is the logical value true and the interpretation of F is the logical
value false.
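
A quick way to gain confidence in such a model is to derive the remaining connectives from negation and disjunction and compare them against the usual truth tables. The sketch below assumes that conjunction is obtained by De Morgan's law, that implication is (~x) ∨ y, and that equivalence holds when both arguments are false or both are true; the function names are hypothetical.

```python
def NOT(x):
    return not x

def OR(x, y):
    return x or y

def AND(x, y):
    return NOT(OR(NOT(x), NOT(y)))        # De Morgan: x and y = ~(~x or ~y)

def IMPLIES(x, y):
    return OR(NOT(x), y)

def EQUIV(x, y):
    return OR(NOT(OR(x, y)), AND(x, y))   # both false, or both true

VALUES = (True, False)

# Compare the derived connectives with Python's built-in boolean operators.
table_ok = all(
    AND(x, y) == (x and y)
    and IMPLIES(x, y) == ((not x) or y)
    and EQUIV(x, y) == (x == y)
    for x in VALUES for y in VALUES
)
```

The check runs over all four argument pairs, so it covers the complete truth table of each connective.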


Def. 2.5 The data type Bool is the set of all type algebras isomorphic to B.


We will often use B as if it is the only model of Bool, and interchange between T and its 
interpretation true in B as well as between F and its interpretation false. We assume that 
the boolean constants T and F are distinguishable from each other a priori, meaning that 
their interpretations in every model of Bool are distinguishable. Each boolean constant is
observably equivalent to itself. 


Def. 2.6.1 Let A be a model of Bool and V_Bool be the value set of Bool defined by A. The
observable equivalence relation on V_Bool is defined to be the identity relation on V_Bool.
The distinguishability relation on V_Bool is defined to be the complement of the observable
equivalence relation with respect to the universal relation on V_Bool (i.e., V_Bool × V_Bool).


The second component of the basis part of the definition of distinguishability is
now given.


Def. 2.6.2 For any data type D other than Bool not having any defining type, no value in
V_D of an algebra A of type D is distinguishable from any other value in V_D.


The inductive part is as follows: 




Def. 2.6.3 Two values v_1 and v_2 in V_D of a type algebra A are distinguishable iff there is a
term of type D' with exactly one variable of type D, expressed as c(x), such that the
instantiation c[x/v_1] interprets in A to a value of a type D' ∈ Δ (an element of V_D') that is
distinguishable from every possible value to which the instantiation c[x/v_2] interprets, or
vice versa.

The case 2.6.2 above can be derived from the case 2.6.3.


Def. 2.7 v_1 and v_2 are observably equivalent, i.e., <v_1, v_2> ∈ E_D, iff v_1 and v_2 are not
distinguishable.


It should also be obvious from the above definitions that if D does not have any observers
and D is different from Bool, then all members of V_D are observably equivalent. The
following definitions are useful in dealing with data types having nondeterministic
operations.


Def. 2.8 Given two subsets A_1 and A_2 of V_D, A_1 is observably equivalent to A_2 and vice
versa, iff (∀ v_1 ∈ A_1)(∃ v_2 ∈ A_2)[<v_1, v_2> ∈ E_D] and vice versa.


Def. 2.9 A_1 and A_2 are distinguishable iff A_1 and A_2 are not observably equivalent.


Then the case 2.6.3 can be rephrased as:
v_1 and v_2 are distinguishable iff there is a term c(x) such that { c[x/v_1]|_A } is
distinguishable from { c[x/v_2]|_A }.
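
The set-based rephrasing lends itself to a small executable sketch. The code below checks Def. 2.8 for two sets of interpretations and then tests whether one given observing term distinguishes two values. The helper names, the encoding of a term as a Python function returning its set of interpretations, and the convention that Choose yields 0 on the empty set are assumptions of the example.

```python
def sets_equivalent(a1, a2, equiv):
    """Def. 2.8: every member of each set has an equivalent partner in the other."""
    forward = all(any(equiv(v1, v2) for v2 in a2) for v1 in a1)
    backward = all(any(equiv(v2, v1) for v1 in a1) for v2 in a2)
    return forward and backward

def distinguished_by(observe, v1, v2, equiv):
    """One term distinguishes v1 and v2 when its two result sets are not equivalent."""
    return not sets_equivalent(observe(v1), observe(v2), equiv)

def identity(a, b):
    return a == b                      # observable equivalence on Int, assumed identity

def choose(s):
    return set(s) if s else {0}        # all possible results of Choose

def size(s):
    return {len(s)}                    # the single result of Size
```

Note that under the stated convention `choose` cannot separate the empty set from {0}, since both yield only 0; a different term, `size`, is needed for that pair.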

Consider the type algebra A_SI of Set-Int (see Subsection 2.2.2). It can be proved
using the definition of Int that the observable equivalence relation on Z, the value set of Int
used in A_SI, is the identity relation. Then the sets {} and {0} are distinguishable since the
term Size(x) distinguishes them. The sets {0, 1} and {1, 2} are also distinguishable since
the term Choose(x) distinguishes them: an interpretation of Choose({0, 1}) is either 0 or 1,
and if 0 is chosen as an interpretation, there is no interpretation of Choose({1, 2}) returning
0. By similar reasoning, {0, 1} is also distinguishable from {0}. {0, 1} is observably
equivalent to itself. The observable equivalence relation on the principal domain of A_SI is
the identity relation. However, it can be shown that the observable equivalence relation on




the principal domain of A'_SI is not the identity relation, because, for example, <1, 2> is
observably equivalent to <2, 1>. In fact, any two sequences having the same set of integers
are observably equivalent. In A'_SI,

E_Set-Int = { <s1, s2> | s1 is a permutation of s2 }.


Thm. 2.1 The observable equivalence relation E_D is an equivalence relation.
Proof That E_D is reflexive and symmetric is obvious from the definition. The transitivity
of E_D can be shown by induction on type algebras using the dependency relation.

The requirement that the functions in a well formed type algebra A preserve the
observable equivalence relation E_D' for each D' ∈ Δ' is equivalent to requiring that
E = { E_D' | D' ∈ Δ' } be a congruence on A, where a congruence on a heterogeneous
algebra is defined in Appendix II.

Def. 2.10 A type algebra A is well formed if and only if E is a congruence on A.


Since we are interested only in well formed type algebras, by a type algebra we henceforth
mean a well formed type algebra unless stated otherwise.

For example, both A_SI and A'_SI are well formed. E' = { E_Set-Int, E_Int, E_Bool }
in case of A'_SI, where E_Int and E_Bool are the identity relations, is a congruence on A'_SI.


Thm. 2.2 Assuming that E_Bool is the largest congruence on a model of Bool, E is the
largest congruence on A.

Proof See Appendix II.

The above theorem implies that the observable equivalence relations on the domains in A
completely extract its observable behavior in the sense that in the quotient algebra A/E
induced by E on A, every value is distinguishable from every other value.




2.2.4.2 Reduced Algebras 


It is technically cumbersome to deal with a type algebra having distinct but 
observably equivalent values, so we introduce the notion of a reduced algebra. 


Def. 2.11 An algebra A of type D is called reduced if and only if for each D' ∈ Δ', E_D' is the
identity relation.


So all members in every domain of a reduced type algebra are distinguishable. For
example, A_SI is reduced, whereas A'_SI is not. B, the model of Bool, is also reduced.

Given an algebra A, we can get its reduced algebra by taking the quotient of A
with E = { E_D' | D' ∈ Δ' }, since E is a congruence on A.⁹ The reduced algebra
corresponding to A is

A/E = [ { V_D'/E_D' | D' ∈ Δ' } ; { g_σ | σ ∈ Ω } ], where

g_σ([v_1], ..., [v_n]) = [ f_σ(v_1, ..., v_n) ].
The principal domain of the reduced algebra corresponding to an algebra of D having no
observers, where D is not Bool, will have a single element. The reduced algebra
corresponding to A'_SI has as its principal domain
SQ'/E_Set-Int = { {<>}, {<0>}, {<1>}, {<-1>}, {<2>}, {<-2>},
                  {<0, 1>, <1, 0>}, {<0, -1>, <-1, 0>}, ... }.
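
The quotient construction can be sketched for the sequence representation of sets. The example restricts the integers to the finite stand-in `INTS`, takes the equivalence class of a sequence to be the set of its permutations, and lifts an insertion function to classes; the helper names and the appending behavior of `in_seq` are assumptions of the sketch.

```python
from itertools import permutations

INTS = (0, 1)   # assumed finite stand-in for the value set of Int
SEQS = [p for n in range(len(INTS) + 1) for p in permutations(INTS, n)]

def cls(s):
    """Equivalence class of s: all sequences that are permutations of s."""
    return frozenset(permutations(s))

# Principal domain of the reduced algebra, restricted to INTS.
PRINCIPAL = {cls(s) for s in SEQS}

def in_seq(s, i):
    return s if i in s else s + (i,)

def g_insert(c, i):
    """Quotient operation: apply in_seq to each representative, then take the class."""
    results = {cls(in_seq(s, i)) for s in c}
    assert len(results) == 1      # independent of the representative chosen
    return results.pop()
```

The internal assertion in `g_insert` is the congruence property in miniature: whichever representative is picked, the resulting class is the same.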


2.2.5 Behavioral Equivalence of Type Algebras


As was stated at the beginning of this section, in order to abstract the observable
behavior of a type algebra, we must abstract from (i) multiple representations of the values
of a data type in the type algebra as well as from (ii) different representational structures
used for the values in different type algebras. The observable equivalence relation
discussed above does the first task. It identifies representations having the same observable
behavior. For the second task, we employ the standard algebraic concept of isomorphism.


9. It can be easily shown that A/E is also a type algebra. 




By combining the two, we define the behavioral equivalence relation on type algebras as
follows:


Def. 2.12 Type algebras A_1 and A_2 are behaviorally equivalent if and only if the reduced
algebra corresponding to A_1 is isomorphically equivalent to the reduced algebra
corresponding to A_2. ∎


We later show that the above definition indeed captures the desired intuition that
two behaviorally equivalent algebras have the same observable behavior. By this, we mean
that an interpretation of a ground term e in one algebra behaves the same way as an
interpretation of e in the other algebra, when manipulated by the operations. (Informally
speaking, a computation results in equivalent values in two related type algebras.)

The isomorphic equivalence of two type algebras is stronger than the
isomorphism of the two type algebras if considered as they are. If D does not have any
defining type, then isomorphic equivalence is the same as isomorphism. However, if
two type algebras are considered in the expanded form in which they have a domain
corresponding to every data type D'' ∈ (D) and a function corresponding to every
operation of D'', then isomorphic equivalence is the same as isomorphism. Since we do not
wish to carry all this information in a type algebra and consider a type algebra in the
expanded form, we assume that for each D' in Δ, the models of D' defining V^1_{D'} and V^2_{D'},
the value sets of D', are isomorphically equivalent, and there is a bijection Φ_{D'} from V^1_{D'} to
V^2_{D'} defined by the isomorphic equivalence relation. We thus do not use any arbitrary
bijection from V^1_{D'} in A_1 to V^2_{D'} in A_2 to show isomorphic equivalence between A_1 and A_2.
Instead, we build the bijections bottom up establishing correspondence between the values.
The set { Φ_{D'} | D' ∈ Δ } is extended with a bijection Φ_D from V^1_D to V^2_D so that
Φ = { Φ_{D'} | D' ∈ Δ' } is an isomorphism from A_1 to A_2.


- 47- 


Def. 2.13 Given two type algebras A_1 and A_2 such that for each D' ∈ Δ, the models
defining V^1_{D'} and V^2_{D'} as the value sets of D' are isomorphically equivalent, which defines a
bijection Φ_{D'}: V^1_{D'} → V^2_{D'}, A_1 and A_2 are isomorphically equivalent if and only if there is a
bijection Φ_D from V^1_D to V^2_D such that Φ = { Φ_{D'} | D' ∈ Δ' } is an isomorphism from A_1 to
A_2. ∎


Note that both A_1 and A_2 above are either deterministic or the corresponding functions in
A_1 and A_2 have the same amount of nondeterminism.

For example, the models of Bool are isomorphically equivalent. The type
algebras A_si and A^1_si of Set-Int are behaviorally equivalent because A_si and A^1_si/E are
isomorphically equivalent. We can define three other type algebras of Set-Int which are
similar to A^1_si: the type algebras A^2_si, A^3_si, and A^4_si have sets represented by finite
ordered sequences of nonrepeating integers, finite ordered sequences of repeating integers,
and finite unordered sequences of repeating integers, respectively. From their definitions,
A_si, A^1_si, A^2_si, A^3_si, and A^4_si can be seen to be behaviorally equivalent.

Note that two behaviorally equivalent type algebras need not have the same
amount of nondeterminism. In fact, one could be deterministic whereas the other could be
nondeterministic because the possible results returned by a nondeterministic function on
an input in such a nondeterministic algebra are observably equivalent.

From the definitions of isomorphic equivalence and behavioral equivalence, we
have the following:

Thm. 2.3 A_1 is isomorphically equivalent to A_2 ⇒ A_1 is behaviorally equivalent to A_2.


Proof Assume A_1 and A_2 are isomorphically equivalent. Let E_1 and E_2 be the sets of
observable equivalence relations on A_1 and A_2 respectively. Then, A_1/E_1 and A_2/E_2 can be
shown to be isomorphically equivalent. (By Theorem 2.2, E_1 is the largest congruence on
A_1 and E_2 is the largest congruence on A_2.) So, A_1 and A_2 are behaviorally equivalent. ∎




Thm. 2.4 The behavioral equivalence relation on type algebras is an equivalence relation. ∎


Proof The reflexivity and symmetry properties are obvious from the definition. The
transitivity can be proved from the fact that the composition of two isomorphisms is also an
isomorphism. ∎


The behavioral equivalence of type algebras A_1 and A_2 can be expressed as

                 Ψ
       A_1 ------------> A_2
        |                 |
    H_1 |                 | H_2
        v                 v
     A_1/E_1 --------> A_2/E_2
                 Φ

such that the above diagram commutes, i.e.,

Φ . H_1 = H_2 . Ψ     (+)

(The function f . g has the same behavior as applying g first and then applying f on the
result.) E_1 and E_2 are congruences consisting of observable equivalence relations on A_1 and
A_2 respectively; A_1/E_1 and A_2/E_2 are the reduced algebras corresponding to A_1 and A_2
respectively; and Φ is the isomorphism defined by the isomorphic equivalence of A_1/E_1
and A_2/E_2. H_1 and H_2 are the homomorphisms induced by the congruences E_1 on A_1 and
E_2 on A_2 respectively. The equation (+) defines the set of many-to-many mappings
{ Ψ_{D'}: V^1_{D'} → V^2_{D'} | D' ∈ Δ ∪ { D } } relating A_1 and A_2. In Appendix II, we discuss, for two
behaviorally equivalent type algebras A_1 and A_2, how a many-to-many mapping
Ψ_D: V^1_D → V^2_D can be constructed from the set of many-to-many mappings { Ψ_{D'} | D' ∈ Δ },
where for each D' ∈ Δ, Ψ_{D'} is a many-to-many mapping from V^1_{D'} to V^2_{D'} defined by
behaviorally equivalent models A'_1 and A'_2 of D' defining V^1_{D'} and V^2_{D'} respectively. We
also show that the above definition of behavioral equivalence indeed captures the desired
property that the sets of interpretations of a ground term are 'equivalent' in behaviorally
equivalent type algebras.




Thm. 2.5 For behaviorally equivalent algebras A_1 and A_2, for every ground term e of type
D' ∈ (D), for every v_1 ∈ ||e||_{A_1}, there exists v_2 ∈ ||e||_{A_2} such that <[v_1], [v_2]> ∈ Φ_{D'}, and
vice versa. ∎


Proof See Appendix II. ∎


The following theorem expresses that the distinguishability and observable
equivalence of ground terms are invariant over behaviorally equivalent type algebras.


Thm. 2.6 For behaviorally equivalent A_1 and A_2, for any ground terms e_1 and e_2 of type
D', { [||e_1||_{A_1}] } = { [||e_2||_{A_1}] } if and only if { [||e_1||_{A_2}] } = { [||e_2||_{A_2}] }. ∎

Proof See Appendix II. ∎


{ [...] } stands for a set of equivalence classes.

2.2.6 Definition of a Data Type 


The behavioral equivalence relation on type algebras abstracts their observable 
behavior as shown above and captures the meaning of a data type. 


Def. 2.14 A data type D is an equivalence class of algebras of type D defined by the
behavioral equivalence relation. ∎


Let M_D stand for the set of all behaviorally equivalent algebras of type D. Every
A in M_D is called a model of D as we have captured the semantics of the operations of D.
The principal domain of a model A defines a value set of D. If a model in D is a reduced
algebra, then it is called a reduced model. Since isomorphically equivalent algebras have
the same amount of nondeterminism, all reduced models of D are either deterministic or all
are nondeterministic (see p. 47). If a reduced model in D is nondeterministic, then the
interpretation of an operation in every reduced model has, informally speaking, the same
amount of nondeterminism. When we wish to present a data type D, we will do so by
presenting an element of M_D as the representative of M_D. We call this model the
denotation of D. We often use a reduced model as the denotation of a data type.


We can order algebras in M_D using the onto homomorphism relation. Given two
algebras A_1 and A_2 ∈ M_D, A_1 ≤ A_2 if and only if A_1 is an onto-homomorphic image of A_2,
when A_1 and A_2 are considered in their expanded form. The relation ≤ can be shown to be
a partial order. A reduced model A of D is the least model in M_D, up to isomorphic
equivalence. It is also called final in M_D because there is an onto homomorphism from
every model A' of D in M_D to A as depicted in the following diagram.

       A'
       |  \
    H' |    \  Ψ = Φ . H'
       v      \
     A'/E' -----> A
             Φ


Def. 2.15 Set-Int is the set of all algebras behaviorally equivalent to A_si. ∎


So, A_si, A^1_si, A^2_si, A^3_si, and A^4_si are models of Set-Int. It can be verified that all
models of Bool are behaviorally equivalent type algebras of Bool. We will use B as the
denotation of Bool and A_si as the denotation of Set-Int.

It should be clear from the above definition that a data type D not having any
observers consists of all type algebras of D. This is so because the definition of behavioral
equivalence of type algebras depends only on the behavior of the observers.

We now compare our definition of a data type with those of Zilles [77] and the
ADJ group [23]. They require a data type to be a set of all isomorphic (isomorphically
equivalent, to be exact) type algebras, which abstracts only the representation details from
the algebras. (They assume that a data type has only deterministic operations.) In their
approach, a data type whose models are the reduced algebras is distinct from another data
type whose models have distinct observably equivalent values even though both data types
have the same observable behavior. For example, the data type consisting of models
isomorphically equivalent to A_si would be different from the data type consisting of
models isomorphically equivalent to A^1_si. From a programmer's point of view, both the
data types are the same and cannot be distinguished. We do not understand the motivation
for making the above distinction. Our definition of a data type is stronger than theirs, and
it does not make the above distinction. It not only abstracts from the representations of the
values in a type algebra, but it also considers representations to be distinguishable only if
they can be distinguished by the operations. It is based on the programming language view
of a data type.

2.2.7 Observable Equivalence and Distinguishability of Terms 


Since every value in the value set V_D defined by a model A of D is an
interpretation of some ground term of type D, the observable equivalence relation and
distinguishability relation on V_D induce the observable equivalence relation and
distinguishability relation on the ground terms of type D as follows:

Two ground terms e_1 and e_2 of type D are observably equivalent w.r.t. A if and only if the
possible interpretations of e_1 in A are observably equivalent to the possible interpretations of
e_2 in A. And, e_1 and e_2 are distinguishable w.r.t. A iff they are not observably equivalent
w.r.t. A.


For example, the ground terms Insert(Insert(Null, 2), 3) and Insert(Insert(Null, 1), 2) of
type Set-Int are distinguishable w.r.t. A_si, as their interpretations {2, 3} and {1, 2} in A_si
are distinguishable, whereas Insert(Insert(Null, 2), 3) and Insert(Insert(Null, 3), 2) are
observably equivalent w.r.t. A_si because they have the same interpretation in A_si. The
observable equivalence and distinguishability relations on ground terms of D w.r.t. A have
the properties of the observable equivalence and distinguishability relations on V_D in A;
remarks and observations made in Subsection 2.2.4 hold for them also.
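The example terms can be checked mechanically. The Python sketch below is a hedged illustration, not the thesis's construction: it interprets the same ground terms in a set-based algebra (modelled on A_si) and in a sequence-based algebra (modelled on A^1_si), and compares values through a membership observer only. The term encoding, the function names, and the finite probe range are assumptions made for the example; the probe range is adequate here only because the terms mention no integers outside it.

```python
NULL = ("Null",)


def insert(term, i):
    """Build the ground term Insert(term, i)."""
    return ("Insert", term, i)


def interp_set(term):
    """Interpretation in a set-based algebra: values are finite sets."""
    if term[0] == "Null":
        return frozenset()
    _, sub, i = term
    return interp_set(sub) | {i}


def interp_seq(term):
    """Interpretation in a sequence-based algebra: nonrepeating sequences."""
    if term[0] == "Null":
        return ()
    _, sub, i = term
    seq = interp_seq(sub)
    return seq if i in seq else seq + (i,)


def has(value, i):
    """Observer shared by both algebras: membership test."""
    return i in value


def observably_equivalent(v1, v2, probes):
    """Equivalent if no probe integer separates the two values."""
    return all(has(v1, i) == has(v2, i) for i in probes)


PROBES = range(0, 5)  # assumed to cover every integer used in the terms below
t23 = insert(insert(NULL, 2), 3)
t12 = insert(insert(NULL, 1), 2)
t32 = insert(insert(NULL, 3), 2)
```

In the sequence-based algebra `t23` and `t32` receive different representations, `(2, 3)` and `(3, 2)`, yet no probe separates them, while `t23` and `t12` are separated in both algebras. This is the model-independence that the following paragraph states in general.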

Using the fact that all models of D are behaviorally equivalent and Theorem 2.6, 
it can be shown that every model of D induces the same observable equivalence relation on 
the ground terms of D. So we can say that the above relations are independent of a model 
and are relations on ground terms of D. We can use a reduced model to derive the 
observable equivalence relation on the ground terms of D. 

Distinguishability and observable equivalence of the ground terms of D are useful 
in understanding the behavior of D. These relations characterize the behavior of D in the 
same way as these relations on the values of a type algebra characterize the behavior of the 


type algebra. Distinguishability captures the informal notion of the ground terms being 




unequal. The models of a data type also induce observable equivalence and 
distinguishability relations on ground terms of type D' ∈ Δ involving the operations of D in
the same way as above. Understanding of the observable equivalence relation on the 
ground terms is helpful in writing a specification of a data type, as discussed in the next 
chapter. A specification of a data type can be viewed as a way to describe the observable 
equivalence relations on ground terms. 

We can also define the observable equivalence relation on terms (possibly 
involving variables) as follows: 

Given terms e_1 and e_2 of type D' ∈ Δ', let X be the set of variables in e_1 and e_2; e_1 and e_2
are observably equivalent if and only if for some A ∈ M_D, for every A-instance V of X, the
possible interpretations of e_1[X/V] in A are observably equivalent to the possible
interpretations of e_2[X/V] in A. And, e_1 and e_2 are distinguishable if and only if they are not
observably equivalent.




2.3 Exceptional Behavior of a Data Type 


So far we have assumed that every operation of a data type D returns a normal 
value of its range type for any input in its domain. This assumption is not realistic, as it 
glosses over an important component of the behavior of a data type. In this section, we
discuss the exceptional behavior of a data type. We relax the constraint that every 
operation terminates normally: An operation can terminate either normally by returning a 
value or by signalling an exception. For example, we modify the behavior of the operation 
Choose on the empty set; henceforth, we assume that it signals an exception instead of 
returning the integer 0. We discuss the assumptions made in the formalism about the 
behavior of the exception handling mechanism of a host programming language supporting 
the abstract data type mechanism. We extend the formalism introduced in the previous 


section to model the exceptional behavior. 
2.3.1 Assumptions about Exception Handling Mechanism 


We consider the exception handling mechanism an integral component of a host 
programming language supporting the data type facility. The exception handling 
mechanism performs two functions: Signalling the exceptions and handling the exceptions 
[52]. Signalling is the way a program notifies its caller of an exceptional condition, and 
handling is the way the caller responds to such a notification. A module implementing a 
data type must provide an adequate interface with the rest of the programming language 
for exception handling. Such an interface can be designed by naming the exceptions 
signalled by the operations along with the specification of information carried as arguments 
to the exception handlers. We will not be concerned with the semantics of the exception
handling mechanism of a programming language in this thesis; we rather consider the
exception handling mechanism insofar as it interacts with the data type mechanism.

Liskov and Snyder [50] discuss two models of structured exception handling: the
resumption model and the termination model. In the resumption model, it is possible to
resume the operation invocation signalling an exception after the exception has been
handled. In the termination model, the operation invocation is assumed to be completed
once it signals an exception. Liskov and Snyder describe many advantages of the
termination model over the resumption model. In particular, the behavior of the handlers
for the exceptions signalled by an operation is separated from the behavior of the operation
in the termination model approach; this maintains the modular structure of the operations.
In the resumption model, on the other hand, the behavior of the handlers becomes a part of
the operation behavior. Though there is not sufficient experience to suggest which of
the two models is better suited for abstract data types, we have decided to adopt the
termination model approach because of its simplicity.
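The difference between the two models can be sketched in Python. This is an informal illustration only: Python's native exceptions follow the termination model, and the resumption model is merely simulated here with a callback, which is an assumption of the sketch rather than a faithful rendering of any particular language. All names are hypothetical.

```python
class NoTop(Exception):
    """Signals that top was invoked on the empty stack."""


def top_termination(stack):
    # Termination model: signalling completes the invocation. Whatever the
    # caller's handler does afterwards is no part of this operation's behavior.
    if not stack:
        raise NoTop()
    return stack[-1]


def top_resumption(stack, handler):
    # Resumption model (simulated): the invocation continues with whatever
    # the handler supplies, so the handler shapes the operation's result.
    if not stack:
        return handler()
    return stack[-1]


def caller(stack):
    """A caller that handles no-top under the termination model."""
    try:
        return top_termination(stack)
    except NoTop:
        return None  # the caller alone decides what happens next
```

In `top_resumption` the value returned on the empty stack depends on the handler passed in, which is the coupling between handler and operation behavior that the termination model avoids.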
In a language supporting a call-by-name argument passing mechanism (or in fact,
any mechanism in which the argument evaluation takes place inside the procedure body), it
is possible to implement a data type whose operations can handle the exceptions signalled
by the evaluation of their arguments. Few recently designed programming languages
support such an argument passing mechanism for at least two reasons: (i) its semantics is
quite complex, and (ii) it is inefficient to implement. Most programming languages
support a call-by-value, call-by-object [52], or call-by-reference mechanism; with these
mechanisms, it is not possible to implement a data type having an operation that handles
exceptions signalled by the evaluation of its arguments. We assume in our work that an
operation does not handle any exception signalled by the evaluation of its arguments;
rather, such exceptions are handled in the module in which the operation is invoked, as
arguments are evaluated inside this module. Every operation is assumed to expect normal
values as arguments.¹⁰

If an operation takes multiple arguments, many arguments may signal exceptions.
The order in which the exceptions are signalled and handled depends upon the evaluation
order of the arguments of a procedure invocation in the host programming language; we do
not address this issue in the thesis. We would like our formalism to be compatible with any
reasonable ordering scheme adopted in the host programming language.


10. However, our approach for defining a data type is general and flexible enough to model a data type
having operations that handle exceptions signalled by its arguments. We simply have to extend the formalism
proposed in this section. A data type with such behavior can also be specified by extending the specification
language to be proposed in the next chapter.




We adopt CLU's view of a data type that the handlers associated with the
exceptions signalled by the operations of a data type are not a part of the data type. This
view keeps the behavior of the handlers separate from the type behavior and maintains the
modular structure of the type mechanism. A user of a data type has the flexibility of
associating different handlers for an exception in different contexts. We will not discuss
the behavior of the handlers in our research.

Exceptions signalled by the operations are distinguished by naming them. An
exception can carry information as its arguments from the place where the exception is
signalled, and this information can be used by a handler associated with the signalled
exception. An operation can signal many exceptions to exhibit different properties of an
input.

For illustration, we consider the data type bounded stack of integers, of size ≤ 100,
denoted by Stk-Int-100. Stk-Int-100 is an instantiation of the parameterized stack example
in [31]; it has the following operations:


Null     returns the empty stack.

Push     inserts a given integer i at the end of a given stack s. It signals the exception
         overflow(s, i) if the given stack is of size ≥ 100. A handler for overflow may examine
         the stack and remove the useless elements to make space for the new element; or it
         may do something else.

Pop      removes the last integer inserted into a given nonempty stack s. When invoked on the
         empty stack, it returns the empty stack back.

Top      returns the last integer inserted into a given nonempty stack s. It signals the exception
         no-top() if s is empty. No-top does not take any argument.

Replace  replaces the last integer inserted into a given nonempty stack s by a given integer i; it
         signals the exception can't-replace(i) on the empty stack.

Empty    tests whether a given stack is empty or not.


For Stk-Int-100, Δ = { Int, Bool } and Ω = { Null, Push, Pop, Top, Replace, Empty }.
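The informal description above can be sketched as an immutable Python class whose operations either return a value or signal a named exception carrying arguments. This is one possible rendering under the termination model, not an implementation given in the source; the class names, the `LIMIT` constant, and the tuple representation are assumptions of the sketch.

```python
LIMIT = 100  # bound taken from the description of Stk-Int-100


class Overflow(Exception):
    """overflow(s, i): carries the full stack and the rejected integer."""

    def __init__(self, stack, item):
        super().__init__(stack, item)
        self.stack, self.item = stack, item


class NoTop(Exception):
    """no-top(): carries no arguments."""


class CantReplace(Exception):
    """can't-replace(i): carries the integer that could not be stored."""

    def __init__(self, item):
        super().__init__(item)
        self.item = item


class StkInt100:
    """Immutable bounded stack; every operation returns a new value."""

    def __init__(self, items=()):
        self._items = tuple(items)

    @staticmethod
    def null():
        return StkInt100()

    def push(self, i):
        if len(self._items) >= LIMIT:
            raise Overflow(self, i)
        return StkInt100(self._items + (i,))

    def pop(self):
        return StkInt100(self._items[:-1])  # the empty stack stays empty

    def top(self):
        if not self._items:
            raise NoTop()
        return self._items[-1]

    def replace(self, i):
        if not self._items:
            raise CantReplace(i)
        return StkInt100(self._items[:-1] + (i,))

    def empty(self):
        return not self._items


s = StkInt100.null().push(1).push(2)
try:
    StkInt100.null().top()
except NoTop:
    handled = "no-top"  # the handler lives in the caller, not in the type
```

The handler for `NoTop` is written at the call site, which matches the view that handlers are not a part of the data type.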


2.3.2 Formalism 


We discuss extensions of the formalism introduced in the previous section to
model the exceptional behavior of the operations. We discuss modifications to the
definitions and their implications. Some important definitions will be fully presented. The
discussion and results of Section 2.2 are applicable once these modifications are
incorporated.

We first extend the definition of a type algebra given in Subsection 2.2.1. We
want to keep the normal values of every data type separate from the exceptions, because
the exceptions have totally different behavior as compared to the normal values, and
because the exceptions should not be typed. In addition to a domain corresponding to
every D' ∈ Δ' containing the normal values of D', a modified type algebra has a new domain
of exceptions, denoted as EXV. EXV consists of all exceptions (or exception values)
signalled by the operations of D'' ∈ (D), where for every exception name ex of arity
D_1 × ... × D_n, and each v_i of type D_i, ex(v_1, ..., v_n) is called an exception value. The
exception domain EXV in a type algebra A of D is specified incrementally. EXV in A
inherits the exception domain of a model A' of D' ∈ Δ whose principal domain V_{D'} is being
used in A. The exception values signalled by the functions interpreting the operations of D
are explicitly specified. Let exv stand for an exception value ex(v_1, ..., v_n). If an operation σ
signals, this is modeled as its interpretation f_σ returning an element of EXV.


We now present the modified type algebra:


Def. 2.16 An algebra A of type D is a heterogeneous algebra
[ { V_{D'} | D' ∈ Δ' }, EXV; { f_σ | σ ∈ Ω } ] such that


(i)   for every defining type D' ∈ Δ, V_{D'} is a value set of D' defined by a model
      of D'; V_{D'} consists only of the normal values returned by the constructors
      of D',
(ii)  EXV is the exception domain including the exception domain of a model
      of D' defining V_{D'} for each D' ∈ Δ, and the exception values signalled by
      the operations of D,
(iii) for every σ ∈ Ω, its interpretation f_σ is a total function of the appropriate
      arity. If D' is the range of σ, f_σ either results in a normal value in V_{D'} or
      returns an exception value. If any argument to f_σ is in EXV, f_σ is not
      defined on these arguments,¹¹ and
(iv)  V_D is the smallest set closed under finitely many applications of the
      functions corresponding to the constructors of D (i.e., { f_σ | σ is a constructor of D }). V_D
      only contains the normal values resulting from the constructors. ∎

Recall that by assumption, even if f_σ is nondeterministic, it behaves deterministically on an
input on which it signals. We assume that for every D' ∈ Δ', it is possible to distinguish the
normal values from the exceptions; this assumption is implicit in every programming
language supporting exception handling.


2.3.2.1 Terms, Exception Terms, and their Interpretations


In addition to terms as defined in Subsection 2.2.3, we have exception terms
defined as follows.


Def. 2.17 For every exception name ex of arity D_1 × ... × D_n, ex(e_1, ..., e_n) is an
exception term if each e_i is a term of type D_i. ∎


An exception term not having any variables is called a ground exception term.

An interpretation of a ground term e in a type algebra A is not defined if any of
its subterms interprets to an exception value. So, Proposition 2.1 in Subsection 2.2.3 gets
modified to


11. An equivalent interpretation is to have f_σ signal a distinguished exception value, say abort() for example.
We have not chosen this interpretation because it gives the impression of the exception value being passed as
an argument to the operation. If we wish to model a data type with an operation handling exceptions
signalled by the evaluation of its arguments, we cannot make the above assumption: an operation σ could
return normal values even when its arguments signal exceptions, so f_σ could return a normal value in that
case.


Prop. 2.3 An interpretation of a ground term of type D' ∈ Δ' in an algebra A of type D
is either a normal value, an exception value, or undefined.


If an interpretation of e is an exception value or is undefined, then e has a unique
interpretation in A. An interpretation of an instantiated term as well as a term in A are
similarly defined. Proposition 2.2 in Subsection 2.2.3 also extends to a modified type
algebra.

An interpretation of an exception ground term ex(e_1, ..., e_n) in A is defined only if
each ||e_i||_A is a normal value of type D_i; then, ||ex(e_1, ..., e_n)||_A = ex(||e_1||_A, ..., ||e_n||_A).
Otherwise, ||ex(e_1, ..., e_n)||_A is undefined. The definition of an interpretation of an
instantiated exception term and of an exception term in A can be given using the above
definitions.
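The three possible outcomes of interpreting a ground term can be sketched as a small evaluator. The Python below is a minimal illustration under assumptions of its own: terms are nested tuples, stacks are plain tuples, an exception term is encoded by prefixing its name with `ex:`, and only Null, Push and Top are modelled (overflow is left out for brevity). None of these encodings come from the source.

```python
from typing import NamedTuple


class Exc(NamedTuple):
    """An exception value ex(v_1, ..., v_n), an element of EXV."""
    name: str
    args: tuple


UNDEFINED = object()  # marks a term that has no interpretation


def f_null():
    return ()


def f_push(s, i):
    return s + (i,)  # overflow is omitted to keep the sketch short


def f_top(s):
    return s[-1] if s else Exc("no-top", ())


OPS = {"Null": f_null, "Push": f_push, "Top": f_top}


def interpret(term):
    """Interpret a ground term: an int or a nested tuple (op, sub_1, ...)."""
    if isinstance(term, int):
        return term
    op, *subterms = term
    args = [interpret(t) for t in subterms]
    # A term is undefined as soon as one subterm is not a normal value.
    if any(a is UNDEFINED or isinstance(a, Exc) for a in args):
        return UNDEFINED
    if op.startswith("ex:"):  # exception term ex(e_1, ..., e_n)
        return Exc(op[3:], tuple(args))
    return OPS[op](*args)
```

For instance, `("Top", ("Null",))` interprets to the exception value no-top(), whereas `("Push", ("Null",), ("Top", ("Null",)))` is undefined because one of its subterms interprets to an exception value.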




2.3.2.2 Examples of Modified Type Algebras


The type algebras A_si and A^1_si of Set-Int given in Subsection 2.2.2 are modified
to incorporate the exceptions. We still use the symbols A_si and A^1_si to stand for the
modified type algebras.

A_si = [ {S, Z, B}, EXV; { Nu, In, Re, Ha, Si, Ch } ]

The Choose operation signals the exception no-element, which is included in EXV;

Ch(∅) ≜ no-element(),

instead of 0. Otherwise, the definitions of the functions remain the same. Similarly, for
A^1_si,

A^1_si = [ {SQ', Z, B}, EXV; { Nu', In', Re', Ha', Si', Ch' } ],

where Ch'(<>) ≜ no-element(), and the definitions of other functions remain the same.

We present a type algebra A_stk of Stk-Int-100.

A_stk = [ {SQ', Z, B}, EXV; { Nu, Pu, Po, To, Re, Em } ],

where Z and B are the value sets defined by the models of Int and Bool respectively. And,
SQ' is the set of all sequences of integers of length ≤ 100,

SQ' = { <>, <0>, <1>, <-1>, <2>, <-2>, <0, 0>, <0, 1>, <0, -1>, ... }.

The interpretations of the operation names are defined as follows:


Nu                     ≜  <>

Pu(<i_1, ..., i_m>, i) ≜  overflow(<i_1, ..., i_m>, i)   if m ≥ 100
                          <i_1, ..., i_m, i>             otherwise

Po(<i_1, ..., i_m>)    ≜  <>                             if m = 0
                          <i_1, ..., i_{m-1}>            otherwise

To(<i_1, ..., i_m>)    ≜  no-top()                       if m = 0
                          i_m                            otherwise

Re(<i_1, ..., i_m>, i) ≜  can't-replace(i)               if m = 0
                          <i_1, ..., i_{m-1}, i>         otherwise

Em(<i_1, ..., i_m>)    ≜  T                              if m = 0
                          F                              otherwise.


Henceforth, by a type algebra, we mean a modified type algebra unless stated otherwise. 
2.3.2.3 Observable Behavior and Distinguishability 


The definition of Bool given in Subsection 2.2.4 remains the same, because no
boolean operation signals.

As was stated earlier, if the operations of a data type exhibit exceptional behavior,
its values can also be distinguished due to its exceptional behavior. If a sequence of
operations signals an exception on one value and does not signal on the other, then the two
values are distinguishable. If a sequence of operations signals on both values, the two
values are distinguishable if the sequence signals different exceptions. Thus the behavior
of the values of a data type can also be observed using the exception handling mechanism
of the host programming language. Even if a data type does not have any defining types,
its values can be distinguished if its operations signal exceptions.

We define the distinguishability relation on V_D and the distinguishability relation
on the exception domain EXV in A mutually recursively, using the distinguishability
relations on the domains corresponding to the defining types. It should be made sure that
arguments to exception names are such that the two definitions are well founded. The
definition of distinguishability on exception values incorporates that (i) two exceptions
having different names are distinguishable, and (ii) two exceptions having the same name
but distinguishable arguments are distinguishable.


Def. 2.18 Given two exception values ex_1(v_1, ..., v_m) and ex_2(v'_1, ..., v'_n) in EXV, they are
distinguishable iff (i) ex_1 ≠ ex_2, or (ii) if ex_1 = ex_2 and m = n, then for some 1 ≤ i ≤ m, v_i
is distinguishable from v'_i. Two exception values are observably equivalent iff they are not
distinguishable. ∎
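Def. 2.18 translates almost directly into code. The following Python sketch is an illustration under assumptions: exception values are name/argument pairs, and distinguishability of the arguments is supplied by the caller as a predicate, which stands in for the mutually recursive relation on the normal domains. The names `Exc` and `exc_distinguishable` are hypothetical.

```python
from typing import Callable, NamedTuple


class Exc(NamedTuple):
    """An exception value ex(v_1, ..., v_n)."""
    name: str
    args: tuple


def exc_distinguishable(e1: Exc, e2: Exc,
                        arg_distinguishable: Callable[[object, object], bool]) -> bool:
    """Different names, or same name with some distinguishable argument pair."""
    if e1.name != e2.name:
        return True
    # Same name implies same arity, so the arguments pair up position by position.
    return any(arg_distinguishable(a, b) for a, b in zip(e1.args, e2.args))


def int_distinguishable(a, b):
    """Assumed relation on Int: two integers are distinguishable iff unequal."""
    return a != b
```

With integer arguments compared by inequality, can't-replace(1) and can't-replace(2) are distinguishable, while two occurrences of no-top() are observably equivalent.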


We denote the observable equivalence relation on EXV by E_EXV.


Def. 2.19 For an algebra A of type D having no defining types and whose operations do
not signal, all values in V_D are observably equivalent. ∎


Def. 2.20 Two normal values v_1 and v_2 in V_D of an algebra A of type D are distinguishable
iff there exists a term with one variable of type D, expressed as c(x), such that one of the
following conditions holds:


(i)   the instantiated terms c[x/v_1] and c[x/v_2] interpret to distinguishable
      exception values in A,
(ii)  c[x/v_1] interprets to a normal value and c[x/v_2] interprets to an
      exception value or vice versa, or
(iii) ||c[x/v_1]||_A and ||c[x/v_2]||_A are normal values and { ||c[x/v_1]||_A } is
      distinguishable from { ||c[x/v_2]||_A }. ∎

Note that in the above definition of distinguishability, we have not included the case in
which exactly one of c[x/v_1] and c[x/v_2] is not defined because the condition (ii) above
takes care of it.
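A bounded search for a distinguishing context c(x) illustrates Def. 2.20 on stacks. The Python sketch below rests on several assumptions that are not in the source: the bound is lowered to a hypothetical `LIMIT = 3` so that overflow is reachable, Push is used only with the fixed integer 0, contexts are chains of stack-to-stack steps ended by one observer, and outcomes are compared by plain equality. Equality is an adequate stand-in here because integers and booleans are distinguishable exactly when unequal and distinct sequences in this algebra can be separated with Pop, Top and Empty. When a step signals, the search treats the context as truncated at that step.

```python
from itertools import product
from typing import NamedTuple


class Exc(NamedTuple):
    """An exception value ex(v_1, ..., v_n)."""
    name: str
    args: tuple


LIMIT = 3  # hypothetical small bound standing in for 100


def push0(s):
    return Exc("overflow", (s, 0)) if len(s) >= LIMIT else s + (0,)


def pop(s):
    return s[:-1]


def top(s):
    return s[-1] if s else Exc("no-top", ())


def empty(s):
    return not s


STEPS = {"Push0": push0, "Pop": pop}      # results stay in the stack domain
OBSERVERS = {"Top": top, "Empty": empty}  # results leave the stack domain


def run(context, s):
    """Apply (step, ..., observer) to s, stopping at the first exception."""
    *steps, observer = context
    for name in steps:
        s = STEPS[name](s)
        if isinstance(s, Exc):
            return s
    return OBSERVERS[observer](s)


def distinguishing_context(v1, v2, max_steps=3):
    """Return the first context whose outcomes differ, or None if none is found."""
    for n in range(max_steps + 1):
        for steps in product(STEPS, repeat=n):
            for obs in OBSERVERS:
                context = (*steps, obs)
                if run(context, v1) != run(context, v2):
                    return context
    return None
```

The empty stack and `(5,)` are separated by Top alone, since one invocation signals no-top() and the other returns a normal value (condition (ii)); `(1, 2)` and `(3, 2)` need the context Top(Pop(x)) (condition (iii) on Int). A result of `None` only means that no context within the search bound was found.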


Def. 2.21 Two normal values v_1 and v_2 are observably equivalent iff they are not
distinguishable. ∎


Theorem 2.1 of Subsection 2.2.4 extends to the above definition of observable
equivalence relation. E_EXV is also an equivalence relation.

We extend the definitions of congruence, homomorphism, and isomorphism for
type algebras having exception domains. The mappings from the normal domains of a type
algebra A_1 to the corresponding normal domains of another type algebra A_2 induce a
mapping Φ_EXV from the exception domain EXV_1 in A_1 to the exception domain EXV_2 in
A_2. The exception names act like operations; they preserve these mappings. Given

A_1 = [ { V^1_{D'} | D' ∈ Δ' }, EXV_1; { f^1_σ | σ ∈ Ω } ],

A_2 = [ { V^2_{D'} | D' ∈ Δ' }, EXV_2; { f^2_σ | σ ∈ Ω } ],

for every exception name ex of arity D_1 × ... × D_n,

< ex(v_1, ..., v_n), ex(Φ_{D_1}(v_1), ..., Φ_{D_n}(v_n)) > ∈ Φ_EXV.

Theorem 2.2, modified to state that E = { E_{D'} | D' ∈ Δ' } ∪ { E_EXV } is the largest
congruence in A, holds; the proof is similar to the proof of Theorem 2.2. E captures the
normal as well as the exceptional behavior of the functions of a type algebra A.

We define a reduced algebra in the same way as in Subsection 2.2.4 using the
congruence E. The definition of behavioral equivalence relation on type algebras is the
same as in Subsection 2.2.5. The definition of isomorphic equivalence used in the
definition of behavioral equivalence is extended by including the mapping Φ_EXV in the
family Φ and requiring Φ_EXV also to be a bijection. The theorems of Subsection 2.2.5
exhibiting that the definition of behavioral equivalence of unmodified type algebras indeed
captures the desired intuition extend to the modified type algebras. The results and proofs
are modified to incorporate the fact that a ground term e (respectively, an instantiated term
e[X/V]) may interpret to a normal value, an exception value, or be undefined (see
Appendix II).

A data type D is defined in the same way as in Subsection 2.2.6: as a set of
behaviorally equivalent type algebras. Let M_D stand for this set. Every model in M_D now
has the exception domain EXV. The observable equivalence and distinguishability
relations on the ground terms of type D are defined as in Subsection 2.2.7. We incorporate
the facts that two ground terms whose interpretations in every model in M_D are undefined
are observably equivalent, and that if one of the ground terms has an undefined
interpretation whereas the other does not, then the two ground terms are distinguishable.




2.3.2.4 Comparison with Goguen's Approach


Our approach is similar to Goguen's approach [20] of modeling the
exceptional behavior of a data type in the sense that exceptions are named and can have
arguments. However, there are crucial differences in the two design philosophies. In
Goguen's approach, the definition of a new data type can possibly extend the definitions of
its defining types. This is so because the exceptions (called not-ok values in [20]) are typed
just like the normal values (called ok values in [20]). Instead of having a single domain of
exceptions, Goguen partitions a value set of D into the exception values and the normal
values; the exception value part of the value set expands as new types using D are defined.
For example, the definition of Stk-Int-100 would extend the definition of Int by defining a
new integer no-top (which is a not-ok value). We consider this as violating the modular
structure of the definitions.

The OBJ language of Goguen and Tardo [21] allows the handlers for the [...]
thus making the type behavior complex. We suspect that they adopt this approach because
of their attempt to develop the algebraic semantics of a complete programming language
[...] handling mechanism from the data type.

In contrast, we have concentrated on the behavior of data types only. We have
separated the exception handling mechanism from the data type mechanism. We have
only [...] component of the [...] handling [...] related to the type [...]
a data type. For reasons discussed earlier, we believe that the type mechanism should only
provide an adequate interface to the exception handling mechanism of the host
programming language. We separate the exception domain EXV from the domain of normal
values as exceptions have different behavior from the normal values. We do not type
exceptions either because doing so seems meaningless. In this way, we have been able to
define the behavior of the operations of a data type completely and uniformly, without
extending the definition of any of its defining types, thus preserving the modular structure
of the type mechanism.


2.3.3 A Simpler Approach 


In this subsection, we discuss another approach for modeling the exception
behavior of a data type, which is simpler than the approach discussed earlier. This
approach has been generally assumed in the literature on algebraic specification of data
types when the authors do not wish to discuss the exception behavior of the operations
[29, 77]. The ADJ group's work [23] is an attempt to formalize it, and Guttag [31] embeds it
in a rich way in a specification language. We discuss this approach for three reasons: (i) our
discussion is simpler and more natural than that of [23], (ii) our discussion would place the
works of those who have implicitly or explicitly assumed this approach of modeling
exceptional behavior on a firm basis, and (iii) our discussion provides a semantic basis of
Guttag's specification language.

In this approach, exceptions signalled by operations having the same range are
not distinguished and no information is passed with an exception to its handler. An
operation on an input either returns a normal value, or signals an exception failure. For
example, the operations Push, Pop, and Replace signal the same exception failure. Every
operation is assumed to expect normal values as arguments. If an argument to an operation
signals failure, then the operation propagates it by signalling it.

Such exceptional behavior of the operations can be modeled by extending the
domain of every $D' \in \Delta'$ in an algebra A of type D (as defined in Subsection 2.2.1) with a
special exception failure; we denote it by $err_{D'}$. Whenever an operation $\sigma$ signals failure,
its interpretation $f_\sigma$ in A returns $err_{D'}$, where $D'$ is the range type of $\sigma$. So we have

$$A = [\, \{ V_D \cup \{err_D\} \} \cup \{ V_{D'} \cup \{err_{D'}\} \mid D' \in \Delta \};\ \{ f_\sigma \mid \sigma \in \Omega \} \,].$$

If any of the $x_i$'s is $err_{D_i}$, then $f_\sigma(x_1, \ldots, x_n) = err_{D'}$, i.e., $f_\sigma$ is strict with respect to its
arguments. We assume that for every $D' \in \Delta'$, it is possible to distinguish between the
normal values and the exception value $err_{D'}$.

We modify the definition of Bool given in Section 2.2. The model B of Bool is
extended to have the exceptional value $err_{Bool}$:

$$B = [\, \{ \{ true, false, err_{Bool} \} \};\ \{ T, F, \lor, \sim, \land, \ldots \} \,],$$

where the behavior of the boolean operations remains the same on normal values.
Besides, every function is strict. Bool is defined as the set of all type algebras isomorphic to
B.
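We discuss a type algebra $A_{Stk}$ of Stk-Int-100.

$$A_{Stk} = [\, \{ SQ' \cup \{err_{Stk}\},\ Z',\ B' \};\ \{ Nu', Pu', Po', To', Re', Em' \} \,]$$

where $B' = B \cup \{err_{Bool}\}$, $Z' = Z \cup \{err_{Int}\}$, and

$$
\begin{aligned}
Nu' &= \langle\rangle \\
Pu'(\langle i_1, \ldots, i_n \rangle, i) &=
  \begin{cases} err_{Stk} & \text{if } n \ge 100 \\ \langle i_1, \ldots, i_n, i \rangle & \text{otherwise} \end{cases} \\
Po'(\langle i_1, \ldots, i_n \rangle) &=
  \begin{cases} err_{Stk} & \text{if } n = 0 \\ \langle i_1, \ldots, i_{n-1} \rangle & \text{otherwise} \end{cases} \\
To'(\langle i_1, \ldots, i_n \rangle) &=
  \begin{cases} err_{Int} & \text{if } n = 0 \\ i_n & \text{otherwise} \end{cases} \\
Re'(\langle i_1, \ldots, i_n \rangle, i) &=
  \begin{cases} err_{Stk} & \text{if } n = 0 \\ \langle i_1, \ldots, i_{n-1}, i \rangle & \text{otherwise} \end{cases} \\
Em'(\langle i_1, \ldots, i_n \rangle) &=
  \begin{cases} T & \text{if } n = 0 \\ F & \text{otherwise.} \end{cases}
\end{aligned}
$$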
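The strictness requirement and the single failure value per carrier can be illustrated with a small executable model. The following is only a sketch under stated assumptions: stack values are represented as Python tuples with the top at the right end, Push is taken to fail on a stack already holding 100 integers (as Stk-Int-100 is described in the text), and the names `Err`, `strict`, `LIMIT`, and the lower-case function names are hypothetical and not part of the original formalism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Err:
    """Hypothetical failure value: Err("Stk") plays the role of err_Stk."""
    type_name: str

ERR_STK, ERR_INT, ERR_BOOL = Err("Stk"), Err("Int"), Err("Bool")
LIMIT = 100  # assumed bound of Stk-Int-100

def strict(range_err):
    """Make an operation strict: a failure argument yields the failure of the range type."""
    def wrap(f):
        def g(*args):
            if any(isinstance(a, Err) for a in args):
                return range_err
            return f(*args)
        return g
    return wrap

def null():
    return ()  # the empty sequence

@strict(ERR_STK)
def push(s, i):
    return ERR_STK if len(s) >= LIMIT else s + (i,)

@strict(ERR_STK)
def pop(s):
    return ERR_STK if len(s) == 0 else s[:-1]

@strict(ERR_INT)
def top(s):
    return ERR_INT if len(s) == 0 else s[-1]

@strict(ERR_STK)
def replace(s, i):
    return ERR_STK if len(s) == 0 else s[:-1] + (i,)

@strict(ERR_BOOL)
def empty(s):
    return len(s) == 0
```

In this sketch, `top(pop(null()))` first produces the stack failure and then, by strictness, the integer failure; no information about which operation failed is carried along, which is the point of the simpler approach.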


The theory discussed in Section 2.2 directly extends to the above algebras also.
The definition of the interpretation of a term in Subsection 2.2.3 easily extends. A ground
term of type $D'$ or an instantiated term may interpret to $err_{D'}$. The definition of
distinguishability of values of D in a type algebra also extends in a straightforward manner.
We want to add to the definition that (i) every normal value of D is distinguishable from
the exceptional value $err_D$, and (ii) two normal values $v_1$ and $v_2$ in $V_D$ of A are also
distinguishable if there is a term $c(x)$ such that $c[x/v_1]$ interprets to an exception value,
whereas $c[x/v_2]$ interprets to a normal value, or vice versa.
The behavioral equivalence relation on modified type algebras is a simple
extension of the definition given in Subsection 2.2.5. The modified definition of
isomorphic equivalence requires that every mapping $\Phi_{D'}$ maps the exception value
$err_{D'}$ in $A_1$ to $err_{D'}$ in $A_2$. Other conditions remain the same in the definition. A data type
D is a set consisting of all behaviorally equivalent type algebras of the above kind. The
observable equivalence and distinguishability relations on ground terms are defined in the
same way as in Subsection 2.2.7.




2.4 Mutually Recursive Data Types


We have assumed so far that data types can be designed hierarchically one at a
time and that the data types on which a data type D depends can be designed
independently of D. These assumptions are not valid for a subclass of data types. In some
cases, it may be more meaningful to associate an operation with a collection of data types,
instead of a single data type; for example, the conversion operations between the data types
fixed point number and floating point number. Or a group of data types may be mutually
dependent such that they cannot be defined one at a time; for example, data types picture,
contents, component, and view in [32] are mutually recursive. In the latter case, the
dependency relation on data types as defined in Section 2.1 will have cycles.

For the above cases, we consider groups of mutually recursive data types together 
as one entity, and define direct dependency and dependency relation on such groups and 
nonrecursive data types in an analogous manner so that the relations do not have any 
cycles. A group of mutually recursive data types can be then defined hierarchically when 
considered as one entity. 

Let $\mathbf{D}$ stand for a group of new types being defined together. Let $\Delta$ stand for the
set of their defining types, assumed to be defined elsewhere, and $\Omega$ stand for the set of their
operation names.

A type algebra for a group of new data types $\mathbf{D}$ is a straightforward extension of a
type algebra for a single data type D. It has a domain corresponding to every $D \in \mathbf{D}$ in
addition to the domains corresponding to every defining type $D' \in \Delta$ and the exception
domain EXV. It also has a total function (deterministic or nondeterministic) corresponding
to every operation name in $\Omega$. Instead of having a single principal domain as in the case of a
type algebra for a single data type, we have many distinguished domains in a type algebra
for $\mathbf{D}$: every domain corresponding to $D \in \mathbf{D}$ is a distinguished domain. In order for the
distinguished domains to be nonempty, it is necessary that at least one of the data types in
$\mathbf{D}$ has a basic constructor (a constructor that does not take any argument of a type in $\mathbf{D}$).
Furthermore, all the distinguished domains must be constructible mutual recursively.
The theory developed for a single data type easily extends to a group of mutually
recursive data types. We can directly extend the definition of the interpretation of a term
in a type algebra defined above. The observable equivalence and distinguishability
relations can be similarly defined on $V_D$ for each $D \in \mathbf{D}$. They induce the observable
equivalence and distinguishability relations on the ground terms of type D. Behavioral
equivalence relation on type algebras can also be defined analogously.

A group of mutually recursive data types $\mathbf{D}$ is a set of all behaviorally equivalent
type algebras of the above kind. Every type algebra in the equivalence class is a model of
$\mathbf{D}$. A model of $\mathbf{D}$ defines a value set of each $D \in \mathbf{D}$, which is the distinguished domain
corresponding to D in the model.


3. Specification of an Abstract Data Type


In this chapter, we discuss a method for specifying abstract data types. Like the
definition method, the specification method is hierarchical and modular. We describe a
specification language in which data types having nondeterministic operations and having
operations exhibiting exceptional behavior can be specified. The main goal in designing
the language has been to develop a good notation for expressing the design of the data
component of programs. The specification language should be as flexible as possible to
enable a designer to conveniently express his/her intent. We do not restrict a specification
to specify a single data type only; instead, a specification in general specifies a set of related
data types sharing a common behavior. A specification only expresses properties particular
to the data type(s) being specified. Properties common to all data types, for instance, the
minimality property, are not specified. They are instead assumed in the semantics of the
specification language.

Since a data type is a set of models, its specification(s) must capture the properties
common to these models. The specification must specify the syntactic structure as well as 
the observable behavior of these models. There can be many ways to do this. One way is 
to present a model that acts as a representative of the above set. For instance, the definition
of a denotation of a data type D can serve as its specification; as an example, the model A,, 
of Set-Int can serve as a specification of Set-Int. A data type is specified in this way in the 
model approach [3], which is briefly discussed in Section 1.2. This method has a 
disadvantage that since a particular representation of the values of the data type is used to 
specify the data type, there is a danger of the irrelevant properties of the model being 
associated with the data type. This shortcoming of the model approach can be 
circumvented by choosing an appropriate semantics of the specification method as in [3].

Another way is to specify the properties that characterize the observable behavior 
of all models of a data type. We adopt this approach, which is called the axiomatic 
approach in Section 1.2. We specify the observable behavior as a finite set of properties of 
the operations of D. These properties are expressed abstractly without referring to any 
particular model of D and without assuming any particular representation of the values of
D. They are presented as first order formulas relating sequences of operations that return
observably equivalent values. The reasons for choosing the axiomatic approach are:

(i) a theory of a data type can be directly developed from its axiomatic specification
without referring to any other domain of discourse,

(ii) our work can be integrated with the work on the development of axiomatic systems
for reasoning about control structures [17, 36] and the automation of the verification
process, and

(iii) the methodology for proving the correctness of an implementation of the data type
with respect to its specification is simple and natural for a wide class of specifications.

Instead of allowing arbitrary first order formulas, we restrict the axioms to be
equations because:

(i) an equational specification is amenable for deducing the properties of a data type (see
the next chapter, where the proof theory of a data type is developed from its specification;
also see Musser [60] for discussion of a theorem prover for equational specifications),

(ii) an equational specification is easier for a programmer to understand (see [29] for a
discussion on viewing equational axioms as recursive programs),

(iii) certain desirable properties of specifications can be guaranteed by putting constraints
on equations [28],

(iv) an equational specification has been found to be more suitable for [...]
deriving an implementation of a data type [64, 68], and

(v) a model can be more easily constructed from an equational specification than from a
specification whose axioms use existential quantifiers [16].

Our specification language allows a specification to introduce a finite set of
auxiliary functions to express the properties of the operations. An auxiliary function is not
an operation of a data type; rather it is a helping function in a specification. So it is a part
of a specification of a data type, and not a part of the data type itself.¹ The use of auxiliary
functions in a specification is a necessity, because if axioms are restricted to be equations
without auxiliary functions, many data types cannot be specified [2, 53, 71, 43].² With the
help of a finite set of auxiliary functions, one can specify, using a finite set of equations,

(i) any data type with a recursively enumerable (r.e.) value set and a finite set of total
deterministic computable functions [28, 43], and

(ii) any data type that can be specified using a recursively enumerable set of equations,
restricted conditional equations, or positive conditional equations [43].

In this sense, our specification language is quite expressive. (For a detailed discussion of
the expressive power of an equational language with auxiliary functions and how it
compares with other algebraic languages for specifying data types, see [43].) Besides, we
have found auxiliary functions convenient and useful in expressing the properties of
complex operations; the judicious choice of auxiliary functions often results in
specifications that are relatively easier to write and understand as compared with equivalent
specifications written without using the auxiliary functions.³

We discuss the specification language in the first section. Different components
of a specification are described. The semantics of a specification is given in the second
section. It is defined to be a set of related data types sharing the common behavior
captured by the specification. In the third section, we state what it means for a data type to
be (precisely) specifiable by a specification; equivalence among specifications is defined.
The fourth section discusses the specification of the data type boolean. In the fifth section,
we discuss two structural properties of a specification, consistency and behavioral

1. An auxiliary function should not be confused with an internal procedure needed in an implementation of
a data type to implement its operations. (Chapter 5 discusses internal procedures.) An auxiliary function
however serves the same purpose in a specification as an internal procedure in an implementation. It is not
available to the users of a data type, and is used only for expressing and proving properties of the data type
from its specification.

2. We conjectured in [43] that even if axioms are allowed to be conditional equations (restricted, positive, or
unrestricted), there are many interesting data types that cannot be specified without auxiliary functions.

3. Guttag [31] rightly compares the use of auxiliary functions in a specification with the use of subroutine
(procedure) abstraction while writing a complex piece of software.




completeness, expressed in terms of relationships among the set of data types specified by
the specification. The consistency property requires that a specification specifies at least
one data type. The behavioral completeness property requires that a specification
completely specifies the observable behavior of the operations on intended inputs; it rules
out only unintentional incompleteness in a specification. In the sixth section, we compare our
specification language with the works of Zilles [77], Guttag et al. [29, 31], the ADJ group
[23], Goguen [20], Burstall and Goguen [7], Goguen and Tardo [21], and Nakajima et al.
[62].


3.1 Specification Language

The specification language has a syntactic unit, called a specification module
(or simply a specification), which in general specifies a set of related data types. We first
discuss specifications of hierarchically structured (nonrecursive) data types; at the end of
the section we discuss a specification for mutually recursive data types.


We will use a single name to stand for any of the data types specified by a
specification. We may use the same name as the name of its specification whenever it is
possible to disambiguate from the context whether a name refers to a data type or its
specification. When we consider more than one specification of a data type, we use
different names for different specifications. Though a long name for a concept may convey
information about the behavior of the concept, the long name can be inconvenient to use,
so we allow abbreviations for long names to be introduced in a specification preceded by
the symbol as. Let D stand for a type being specified by a specification S.

A specification in general has four components:

(i) Operations,
(ii) Auxiliary Functions,
(iii) Restrictions, and
(iv) Axioms.

The operations component specifies the syntactic properties of D, and the restrictions
component and the axioms component specify its semantic properties. We illustrate
different components of a specification using the specifications given in Figures 3.1 and 3.2.


Figure 3.1 is a specification of Set-Int. Figure 3.2 is a specification of a set Stk-Int of data
types; the data type Stk-Int-100 defined in Chapter 2 is in this set.

A specification is hierarchically structured; it refers to the specifications of data
types other than D assuming that these specifications are given elsewhere. Data types other
than D may have already been specified, or they will be specified later. For example, the
specification of Set-Int in Figure 3.1 refers to a specification of a data type Int. We assume
that Int is specified elsewhere. Since a specification of Int can specify a set of data types,
Int in Figure 3.1 stands for any data type in the set.




Figure 3.1. Specification of Set-Int

Operations

    Null    : → Set-Int                 as ∅
    Insert  : Set-Int × Int → Set-Int
    Remove  : Set-Int × Int → Set-Int
    Has     : Set-Int × Int → Bool      as x2 ∈ x1
    Choose  : Set-Int → Int             nondeterministic
                  → no-element()

Restrictions

    #(s) = 0 ⇒ Choose(s) signals no-element

Axioms

    Remove(∅, i) = ∅
    Remove(Insert(s, i1), i2) = if i1 = i2 then Remove(s, i1) else Insert(Remove(s, i2), i1)
    i ∈ ∅ = F
    i1 ∈ Insert(s, i2) = if i1 = i2 then T else i1 ∈ s
    #(∅) = 0
    #(Insert(s, i)) = if i ∈ s then #(s) else #(s) + 1
    Choose(s) ∈ s = T
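As an informal illustration of how the axioms of Figure 3.1 constrain a model, the following sketch checks them against one candidate model of Set-Int. It rests on assumptions that are not part of the specification: sets are represented by Python `frozenset` values, the exception no-element by a hypothetical Python exception class `NoElement`, and the nondeterminism of Choose by a random pick. Passing these checks on sample inputs is evidence only, not a proof that the model satisfies the specification.

```python
import random

class NoElement(Exception):
    """Hypothetical stand-in for the exception no-element()."""

def null():
    return frozenset()

def insert(s, i):
    return s | {i}

def remove(s, i):
    return s - {i}

def has(s, i):
    return i in s

def size(s):  # the function written # in Figure 3.1
    return len(s)

def choose(s, rng=random):
    # Restriction: #(s) = 0 => Choose(s) signals no-element
    if size(s) == 0:
        raise NoElement()
    return rng.choice(sorted(s))  # any member is an acceptable result

def check_axioms(s, i1, i2):
    """Check each axiom of Figure 3.1 on the given sample values."""
    assert remove(null(), i1) == null()
    expected = remove(s, i1) if i1 == i2 else insert(remove(s, i2), i1)
    assert remove(insert(s, i1), i2) == expected
    assert has(null(), i1) is False
    assert has(insert(s, i2), i1) == (True if i1 == i2 else has(s, i1))
    assert size(null()) == 0
    assert size(insert(s, i1)) == (size(s) if has(s, i1) else size(s) + 1)
    if size(s) > 0:  # Choose(s) ∈ s = T applies only where Choose returns normally
        assert has(s, choose(s))
    return True
```

Because Choose is nondeterministic, the last axiom does not fix which element is returned; it only requires that whatever is returned belongs to the set.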


Whenever we introduce a new construct of a specification in this section, we
informally discuss its meaning for motivation and clarity of exposition. As was stated
above, the precise semantics of a specification will be given in the next section.


3.1.1 Operations


This component specifies (i) the domain and range, and (ii) the names of the
exceptions signalled by every operation of D on its intended inputs, along with the types of
the arguments to the exceptions. It is a sequence of specifications of the following form:

$$
\begin{aligned}
\sigma : D_1 \times \cdots \times D_n &\rightarrow D' \\
  &\rightarrow ex_1(D_{11}, \ldots, D_{1m_1}) \\
  &\;\;\vdots \\
  &\rightarrow ex_k(D_{k1}, \ldots, D_{km_k})
\end{aligned}
$$

where $D_1 \times \cdots \times D_n$ is the domain of $\sigma$ and $D'$ is its range. $\sigma$ signals exceptions having
names $ex_1, \ldots, ex_k$, whose argument types are also specified. If an operation is specified
to signal an exception, the exception must be listed in its syntactic specification. If $\sigma$ does
not take any argument, then it is a constant of its range type. If an exception name ex does
not take any argument, it is expressed as ex() or simply ex. The operations component of a
specification of D indirectly specifies the $\Delta$ and $\Omega$ of D.

When an abbreviation is introduced for an n-ary operation name, we can specify
how the abbreviation distributes over the arguments using the argument place holders
$x_1, \ldots, x_n$. For example, the operation Has of Set-Int is abbreviated to $\in$ and it is used as
'$x_2 \in x_1$'. We discuss later (Subsection 3.1.5) how nondeterministic operations are specified.


3.1.2 Auxiliary Functions 


This component is optional; it exists if auxiliary functions are used in writing the
Axioms and the Restrictions. As was discussed before, auxiliary functions are introduced to
enhance the expressive power of the specification language and to make the language more
flexible so that specifications are easier to write and understand. We do not recommend
choosing auxiliary functions randomly to express the behavior of the operations. Instead,
they should be chosen with care. An auxiliary function should embody a subsidiary
procedural abstraction needed to express the operation behavior. It is a good design
practice to completely specify an auxiliary function even if its behavior is needed only for a
subset of its input domain. Furthermore, if an auxiliary function is of the result type D, it
should not have to construct values that cannot be constructed by the constructors of D.
Every auxiliary function is deterministic, and there are no restrictions associated with it.⁴
For example, the specification of Stk-Int in Figure 3.2 uses the auxiliary function Size.

We specify the domain and range of every auxiliary function used in the
specification in the same way as the operations. Let $\Omega_a$ stand for the set of all auxiliary
functions used in a specification. An auxiliary function may use a data type not in $\Delta'$
($= \Delta \cup \{D\}$) as a component of its domain or as its range; we call such a data type an
auxiliary type. Like a defining type, every auxiliary type is assumed to be specified
elsewhere. Let $\Delta_a$ stand for the set of auxiliary types used by the auxiliary functions in $\Omega_a$.
If a specification does not have the auxiliary functions component, then $\Omega_a = \emptyset$ and
$\Delta_a = \emptyset$.

We extend the definition of a term in Subsection 2.2.3 to include terms
constructed using the auxiliary functions and the operation symbols of the auxiliary types.

Def. 3.1 An auxiliary term of type $D'$ is defined inductively as
(i) a term of type $D'$, or
(ii) if $\sigma$ is an auxiliary function in $\Omega_a$ or an operation symbol of an auxiliary type in $\Delta_a$,
such that its domain is $D_1 \times \cdots \times D_n$ and its range is $D'$, then $\sigma(e_1, \ldots, e_n)$
is an auxiliary term of type $D'$ if and only if each $e_i$ is an auxiliary term of type $D_i$.

Clearly, if $\Omega_a$ and $\Delta_a$ are the empty sets, the definitions of an auxiliary term and a term
coincide. An auxiliary exception term can be defined by replacing terms by auxiliary terms
in the definition of an exception term in Subsection 2.3.2. Henceforth, by a term, we mean
an auxiliary term, and by an exception term, we mean an auxiliary exception term, unless
stated otherwise.


4. These constraints on auxiliary functions are imposed for convenience and simplicity. Our formalism
would work equally well if these constraints were not imposed.




Figure 3.2. Specification of Stk-Int

Stk-Int as Stk

Operations

    Null    : → Stk
    Push    : Stk × Int → Stk
                  → overflow(Stk, Int)
    Pop     : Stk → Stk
    Top     : Stk → Int
                  → no-top()
    Replace : Stk × Int → Stk
    Empty   : Stk → Bool

Auxiliary Functions

    Size    : Stk → Int                 as #(x)

Restrictions

    Pre(Pop(s)) :: ~ Empty(s)
    Pre(Replace(s, i)) :: ~ Empty(s)
    Empty(s) ⇒ Top(s) signals no-top()
    Push(s, i) signals overflow(s, i) ⇒ #(s) ≥ 100

Axioms

    Pop(Push(s, i)) = s
    Top(Push(s, i)) = i
    Replace(s, i) = Push(Pop(s), i)
    Empty(Null) = T
    Empty(Push(s, i)) = F
    #(Null) = 0
    #(Push(s, i)) = #(s) + 1
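One way to read the equational axioms of Figure 3.2 is as a recursive program over ground terms, applying each equation from left to right. The sketch below does this under assumptions of our own: terms are nested Python tuples such as `("Push", ("Null",), 1)`, the function name `normalize` is hypothetical, and the restrictions (preconditions and exception conditions) are not modeled, so a term such as Pop(Null) is simply left unreduced.

```python
def normalize(t):
    """Rewrite a ground term with the axioms of Figure 3.2, read left to right."""
    if not isinstance(t, tuple):
        return t  # an integer or boolean literal
    op, *args = t
    args = [normalize(a) for a in args]
    if op == "Replace":
        # Replace(s, i) = Push(Pop(s), i)
        return normalize(("Push", ("Pop", args[0]), args[1]))
    if op in ("Pop", "Top", "Empty", "Size"):
        s = args[0]
        if isinstance(s, tuple) and s[0] == "Push":
            _, rest, item = s
            if op == "Pop":    # Pop(Push(s, i)) = s
                return rest
            if op == "Top":    # Top(Push(s, i)) = i
                return item
            if op == "Empty":  # Empty(Push(s, i)) = F
                return False
            n = normalize(("Size", rest))  # #(Push(s, i)) = #(s) + 1
            if isinstance(n, int):
                return n + 1
        elif s == ("Null",):
            if op == "Empty":  # Empty(Null) = T
                return True
            if op == "Size":   # #(Null) = 0
                return 0
    return (op, *args)  # no axiom applies; the term is left as it is
```

For instance, with `s = ("Push", ("Push", ("Null",), 1), 2)`, the term `("Top", ("Replace", s, 9))` rewrites to the integer 9 by first unfolding Replace and then applying the Top axiom.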


3.1.3 Restrictions 


The restrictions and axioms components of a specification specify the normal as
well as the exceptional behavior of the operations. They also define the auxiliary functions,
if any, used in the specification. The axioms component specifies the normal behavior of
the operations. The exceptional behavior is specified as a separate layer over the normal
behavior. This is achieved by specifying restrictions on the operations in the restrictions
component. An axiom in the axioms component holds only if the operations used in the
axiom satisfy the specified restrictions. The restrictions component is an extension of the
Restrictions Specifications of Guttag [31].
The restrictions component is a set of restrictions; every restriction is associated
with an operation. There are two kinds of restrictions:
(i) Preconditions, and
(ii) Exception Conditions.

Every exception listed in the syntactic specification of an operation should have an
associated restriction specifying the input condition when the exception is signalled or may
be signalled by the operation. The boolean conditions in the exception conditions for an
operation must be disjoint. Another constraint on the boolean conditions when they use
nondeterministic operations is discussed later. As is stated in the first chapter, for
operations having complex behavior, it may be very difficult to specify conditions on their
inputs under which they signal a particular exception. This approach of specifying the
exceptional behavior is not suitable for such operations.


3.1.3.1 Preconditions 


The precondition restriction for an operation specifies the subset of its input
domain on which the operation behavior is of interest. The operation is expected to be
invoked on inputs in this subset; it is the user's responsibility to ensure this. The operation
behavior is specified only on these inputs; it is left unspecified on inputs outside the subset
because it does not matter. An operation can either signal an exception or return a value
on an input not satisfying the precondition. For example, in certain applications, we may
not care how the operation Replace in Figure 3.2 behaves on the empty stack as it is never
going to be invoked on the empty stack. It could either return a stack value or signal an
exception. Also see [51, 32] for more examples of such operations. If a specification
commits to a particular behavior on an input not satisfying the precondition, for instance
signalling an exception, many implementations would be ruled out. Our approach is to
encourage a designer to specify only that portion of the data type behavior which is of
interest to him and allow the rest of the type behavior to be left unspecified so that an
implementor has the maximum flexibility.
The precondition restriction for an operation $\sigma \in \Omega$ is specified as:

$$Pre(\sigma(X)) :: P(X),$$

where $P(X)$ is a boolean term having $x_1, \ldots, x_n$ (the input X) as its variables, and it cannot
signal on X. The axioms involving $\sigma$ hold only if the input to every invocation of $\sigma$ satisfies
the precondition $P(X)$. If the Restrictions component does not specify a precondition for
an operation, the operation is assumed to be specified for its entire syntactic domain, i.e., its
precondition is T. For example, ~ Empty(s) is the precondition for Pop as well as Replace
in the specification of Stk-Int in Figure 3.2. The specification does not specify the behavior
of these operations for the empty stack. No precondition is specified for any other
operation, so their preconditions are T. Similarly, no precondition is specified for any
operation in the specification of Set-Int in Figure 3.1. If a precondition different from T is
specified for an operation $\sigma$, $\sigma$ is said to have a nontrivial precondition. Let $P_\sigma$ stand for
the precondition for $\sigma$.

If an operation $\sigma$ does not signal on an input not satisfying its precondition, it
cannot return an arbitrary value. If $\sigma$ is a constructor, as for example, the operations Pop
and Replace in Figure 3.2, the result must be constructible by the constructors of D using
inputs satisfying the associated preconditions. Similarly, if $\sigma$ is an observer, then it must
return a value of its result type.
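To make the freedom left by a nontrivial precondition concrete, the following sketch gives two candidate implementations of Pop and Replace that differ only on the empty stack, where Figure 3.2 leaves the behavior unspecified. The representation (Python tuples), the exception class `Underflow`, and all function names are assumptions made for illustration; neither variant is claimed to be the intended implementation.

```python
class Underflow(Exception):
    """Hypothetical exception used by variant B outside the precondition."""

def push(s, i):
    return s + (i,)

def empty(s):
    return len(s) == 0

# Variant A: return a constructible stack value on the empty stack.
def pop_a(s):
    return s[:-1]             # the empty stack stays empty

def replace_a(s, i):
    return s[:-1] + (i,)      # on the empty stack this equals Push(Null, i)

# Variant B: signal an exception on the empty stack.
def pop_b(s):
    if empty(s):
        raise Underflow()
    return s[:-1]

def replace_b(s, i):
    if empty(s):
        raise Underflow()
    return s[:-1] + (i,)

def agree_where_specified(stacks, i):
    """Check Replace(s, i) = Push(Pop(s), i) for both variants wherever ~Empty(s) holds."""
    for s in stacks:
        if not empty(s):      # Pre(Pop(s)) and Pre(Replace(s, i)) :: ~ Empty(s)
            assert replace_a(s, i) == push(pop_a(s), i)
            assert replace_b(s, i) == push(pop_b(s), i)
            assert replace_a(s, i) == replace_b(s, i)
    return True
```

Both variants satisfy the axiom on every input meeting the precondition, and variant A returns only values that the constructors can build, so under the reading given above either would be admitted by the specification.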




3.1.3.2 Exception Conditions


There are two kinds of exception conditions:
(i) required exception conditions, and
(ii) optional exception conditions.

A required exception condition for an operation $\sigma$ is expressed as

$$R(X) \Rightarrow \sigma(X) \text{ signals } ex(e_1, \ldots, e_m),$$

stating that if the input X satisfies the precondition $P_\sigma$ and the boolean condition $R(X)$,
which is a boolean term, then the operation $\sigma$ must signal the exception ex having $e_1, \ldots, e_m$
as the arguments to its handler(s). The exception name ex is of arity $D_1 \times \cdots \times D_m$, and
each $e_i$ is a term of type $D_i$ having variables only from the set $\{x_1, \ldots, x_n\}$. For example,
in Figure 3.1, the operation Choose is specified to signal the exception no-element on the
empty set. In Figure 3.2, the operation Top signals no-top on the empty stack. We call the
above exception condition required because the operation is required to signal the
exception. It is possible to specify an operation signalling different exceptions for different
subsets of inputs.
In certain applications, it may be restrictive to require that an operation signal an
exception when its input satisfies a condition. At the same time, it may not be desirable to
leave the operation behavior completely unspecified. Instead, we would like to place
constraints on the behavior. If an input to the operation satisfies the specified condition,
the operation is specified to have the option of either signalling the specified exception or
returning a normal value. In case the operation chooses not to signal, it must behave as
specified by the axioms. Optional exception conditions are introduced to capture such
behavior of an operation. An optional exception condition is expressed as

$$\sigma(X) \text{ signals } ex(e_1, \ldots, e_m) \Rightarrow O(X),$$

stating that in case $\sigma$ signals an exception ex having $e_1, \ldots, e_m$ as arguments and the input X
satisfies the precondition $P_\sigma$, then the input X must also satisfy the boolean condition
$O(X)$, a boolean term.

Optional exceptions are especially useful for specifying a set of similar data types
having values whose size has different upper bounds. It is possible to state a size
requirement on the values of the data type, but at the same time not be very restrictive
about the requirement. An implementor could decide on the exact bound based on
convenience insofar as the specified bound condition is met. Such behavior of a data type
is specified by stating that the constructors have the option to signal exceptions.

For example, in the data type Stk-Int-100 defined in the previous chapter, the
operation Push signals if its stack argument is of size 100. If the desired requirement is that
a stack value be able to store at least 100 integers, this behavior of Push is very restrictive.
It rules out an implementation supporting stack values of size > 100, even though the
implementation has the desired behavior except that Push does not signal exactly on stacks
of size 100, but rather on stacks of size 128. We specify the desired requirement in
Figure 3.2 by stating that Push optionally signals; whenever Push signals overflow, its stack
argument must be at least of size 100. In this way, a specification specifies the least upper
bound on the size of the values of a data type, and the responsibility of deciding the exact
upper bound is delegated to an implementor. Such a specification is flexible and not [...]


3.1.3.3 Discussion 


Note that the nontrivial precondition restrictions and the optional exception 
conditions leave the specification of the operations incomplete because the operation 
behavior is not completely specified on a subset of inputs. An operation could behave on 
such inputs in any way consistent with the specified behavior. That is why a specification 
in general specifies a set of related data types; the operations of these data types have the 
same behavior for a subset of their syntactic domains. For example, Stk-Int specifies data 
types having stack values whose size has different upper bounds ≥ 100. The operations of 
these data types behave the same way on stacks of size ≤ 100, except that Pop and Replace 
of different data types may behave differently on the empty stack. We call such 
incompleteness in a specification intentional incompleteness, in contrast to unintentional 
incompleteness introduced because of an omission on the part of a designer in specifying 
the properties of the operations. 
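
The intentional incompleteness introduced by an optional exception condition can be illustrated with a minimal Python sketch. It is not part of the specification language; the names Overflow, make_push and satisfies_optional_condition are hypothetical, stacks are modelled as tuples, and only the size requirement of Stk-Int (Push may signal overflow only on stacks of size at least 100) is checked.

```python
class Overflow(Exception):
    """Hypothetical stand-in for the overflow exception signalled by Push."""


def make_push(capacity):
    """Return a Push on immutable stacks (tuples) that signals at `capacity`."""
    def push(stack, item):
        if len(stack) >= capacity:
            raise Overflow()
        return stack + (item,)
    return push


def satisfies_optional_condition(push, max_size=300, bound=100):
    """Check: whenever Push signals overflow, its stack argument has size >= bound."""
    stack = ()
    for size in range(max_size):
        try:
            stack = push(stack, 0)
        except Overflow:
            # `size` is the size of the stack on which Push signalled.
            return size >= bound
    return True  # Push never signalled within the tested range
```

Under these assumptions, make_push(100) and make_push(128) both pass the check, so both bounds are admitted by the same specification, whereas make_push(64) fails it.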


It should be intuitively clear that if no nontrivial precondition and no optional 
exception condition are associated with any operation, and the axioms completely capture 
the observable behavior of the operations, then a specification specifies a single data type in 
case the specification of every defining type also specifies a single data type. We elaborate 
this informal statement later in the chapter. 

3.1.4 Axioms 


This component specifies the normal behavior of the operations in Ω and the 
auxiliary functions in A_f, if they are used in a specification. The behavior is specified as a 
finite set of equations of the form e_1 = e_2, where e_1 and e_2 are auxiliary terms of the same 
type; at least one of e_1 and e_2 must have its outermost symbol in Ω ∪ A_f, otherwise an 
equation would not be specifying a property of D. e_1 = e_2 informally means that the 
sequences of operations expressed by the terms e_1 and e_2 have the same behavior, i.e., when 
values are substituted for variables in e_1 and e_2, the instantiated terms interpret to 
observably equivalent values. The symbol '=' is interpreted as the observable equivalence 
relation. The equations attempt to capture the observable equivalence relations on ground 
terms defined by the data type(s) being specified, which is discussed in Chapter 2. 

If a specification does not have the restrictions component (i.e., the operations do 
not signal exceptions and there is no nontrivial precondition associated with any operation), 
then the variables in an axiom are universally quantified: Any value of the appropriate type 
can be freely substituted for a variable. 

If a specification has a restrictions component, then an axiom is interpreted in a 
different way; the variables in an axiom cannot be freely substituted. We must also 
consider the restrictions imposed on the operations appearing in the axioms. The values 
substituted for the variables must satisfy the following two conditions: 

(i) For every operation σ having a nontrivial precondition P_σ, the arguments to every 
invocation of σ in the axiom must satisfy P_σ, and 
(ii) an instantiation of any subexpression in the axiom must not interpret to an exception 
value. 

The condition (ii) above is equivalent to requiring that an interpretation of an instantiation 
of e_1 or e_2 is neither undefined nor an exception value. For example, consider the axiom 

    Replace(s, i) = Push(Pop(s), i)        (*) 

in the specification of Stk-Int in Figure 3.2. It applies only for the values of s for which 
~Empty(s) holds, which is the precondition for both Replace and Pop. Furthermore, Push 
must not signal overflow on the result returned by Pop, which it cannot in any case. The 
equations characterize the normal behavior of the operations in this way. 
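
The restricted reading of an axiom can be sketched in Python as follows. This is only an illustration under stated assumptions: stacks are tuples, tuple equality stands in for observable equivalence, CAPACITY is an arbitrary bound picked by one hypothetical implementation, and the behavior of pop and replace on the empty stack is deliberately arbitrary because the precondition ~Empty(s) leaves it open.

```python
class Signal(Exception):
    """Hypothetical stand-in for an exception value such as overflow."""


CAPACITY = 128  # assumed bound chosen by one particular implementation


def push(s, i):
    if len(s) >= CAPACITY:
        raise Signal("overflow")
    return s + (i,)


def pop(s):
    # On the empty stack the precondition ~Empty(s) fails; the result is arbitrary.
    return s[:-1]


def replace(s, i):
    # Arbitrary behavior on the empty stack: return it unchanged.
    return s[:-1] + (i,) if s else s


def axiom_star_holds(s, i):
    """Check Replace(s, i) = Push(Pop(s), i) for one instance of (s, i)."""
    if len(s) == 0:
        return True  # condition (i): the precondition of Replace and Pop is not met
    try:
        lhs, rhs = replace(s, i), push(pop(s), i)
    except Signal:
        return True  # condition (ii): a subexpression interprets to an exception
    return lhs == rhs  # tuple equality stands in for observable equivalence
```

In this sketch the two sides of (*) differ on the empty stack, yet the axiom is still satisfied because that instantiation is excluded by the precondition.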

It is often the case that two terms are observably equivalent only when a condition 
is placed on their variables; for example, in the second axiom in the specification of Set-Int 
in Figure 3.1, Remove(Insert(s, i1), i2) is observably equivalent to Insert(Remove(s, i2), i1) 
only if i1 and i2 are not equal. So, while writing the axioms, it is convenient to assume an 
auxiliary function if-then-else corresponding to every D' ∈ Δ' ∪ A_t. The definition of 
if-then-else is given as: 

    if-then-else : Bool × D' × D' → D'    as if x_1 then x_2 else x_3 
    if T then x else y = x 
    if F then x else y = y 


Since these functions are used frequently, they are assumed to be implicitly defined 
whenever needed. They are not explicitly stated in the auxiliary functions component of 
the specification, and are not in A_f. If Bool is not a defining type, then Bool is assumed to 
be an auxiliary type. An axiom of the form 'e_1 = if b then e_2' stands for the equation 
'e_1 = if-then-else(b, e_2, e_1).' We call 'e_1 = if b then e_2' a conditional equation.⁵ It is 
equivalent in its interpretation to the formula 'b = T ⇒ e_1 = e_2.' An axiom of the form 
'e_1 = if b then e_2 else e_3' stands for the equation 'e_1 = if-then-else(b, e_2, e_3).' It is 
equivalent to the following two conditional equations: 

    e_1 = if b then e_2, 

    e_1 = if ~b then e_3. 
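
As a rough illustration of how a conditional equation is read, the following Python sketch checks the Set-Int axiom relating Remove and Insert on a hypothetical model in which sets of integers are frozensets. The function names are assumptions made for this example, and equality of frozensets stands in for observable equivalence.

```python
def insert(s, i):
    return s | {i}


def remove(s, i):
    return s - {i}


def conditional_equation_holds(b, lhs, rhs):
    """Read 'lhs = if b then rhs' as the formula 'b = T implies lhs = rhs'."""
    return (not b) or lhs == rhs


def remove_insert_axiom_holds(s, i1, i2):
    """Remove(Insert(s, i1), i2) = if i1 /= i2 then Insert(Remove(s, i2), i1)."""
    return conditional_equation_holds(
        i1 != i2,
        remove(insert(s, i1), i2),
        insert(remove(s, i2), i1),
    )
```

When i1 and i2 are equal the condition is F, so the equation places no constraint on that instance.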


5. Note that a conditional equation as defined above is different from a positive conditional equation of the 
ADJ [71], in which the condition in the axiom can be expressed using = positively. A conditional equation of 
the above form is called a restricted conditional equation in [43]. We have chosen such axioms because of 
simplicity, as even using positive conditional equations as axioms does not add to the expressive power of the 
specification language [43]. Furthermore, homomorphisms do not preserve positive conditional equations. 




3.1.5 Specifying Nondeterministic Operations 


If an operation is nondeterministic, this is specified using the symbol 
nondeterministic following its range specification, as for the Choose operation of Set-Int in 
Figure 3.1. The behavior of a nondeterministic operation is specified in the same way as of 
a deterministic operation. The restrictions component may specify a precondition, a set of 
required exception conditions, and a set of optional exception conditions for a 
nondeterministic operation. For a nondeterministic observer returning many possible 
results on an input, the axioms do not specify the results; instead, they specify the 
properties of the results. For example, the axiom specifying the behavior of the 
nondeterministic operation Choose of Set-Int on a nonempty set s states that a result 
returned by Choose on s must be an element of the set s. For a nondeterministic 
constructor, its behavior is characterized by specifying the results returned by the observers 
on the possible values constructed by it. 

If a boolean condition in a restriction is expressed using nondeterministic 
operations, we require that for every input X, the boolean condition behaves 
deterministically, i.e., it returns either T or F. It is meaningless for a boolean condition to 
return T as well as F on X. In case of a precondition, the instantiated boolean condition 
returning T as well as F would mean that the input satisfies the precondition as well as does 
not satisfy the precondition. In case of an exception condition, this would mean that σ 
signals or may signal on the input as well as that σ does not signal on the input. 

For an equational axiom 'e_1 = e_2' expressed using nondeterministic operations, we 
use the following interpretation: For an instantiation of the variables in the axiom allowed 
by the preconditions and restrictions, the set of possible values returned by the instantiated 
e_1 is observably equivalent to the set of possible values returned by the instantiated e_2 (i.e., 
for every choice of nondeterministic operations in e_1, the value returned by the instantiated 
e_1 is observably equivalent to a value returned by the instantiated e_2 for some choice of 
nondeterministic operations in e_2, and vice versa). We have rejected another possible 
interpretation which is that for any choice of nondeterministic operations in both e_1 and e_2, 
the values returned by the instantiated e_1 and e_2 are observably equivalent, because under 
this interpretation, the axiom does not hold when e_1 and e_2 exhibit nondeterministic 
behavior; an equational axiom thus does not express any useful property. If an axiom is a 
conditional equation 'e_1 = if b then e_2', where the boolean condition b involves 
nondeterministic operations, then we require that for an instantiation of the variables in 
X, b behaves deterministically. As in case of a boolean condition in a restriction, an 
instantiation of b behaving nondeterministically and returning T as well as F does not make 
any sense in a conditional equation. 
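
The difference between the adopted and the rejected interpretation can be made concrete with a small Python sketch. It assumes that the possible results of each instantiated term are given as finite collections and that ordinary equality stands in for observable equivalence; the function names are hypothetical.

```python
def adopted_interpretation(lhs_results, rhs_results):
    """Every possible value of one side is matched by some value of the other side."""
    return set(lhs_results) == set(rhs_results)


def rejected_interpretation(lhs_results, rhs_results):
    """Every possible value of one side equals every possible value of the other."""
    return all(a == b for a in lhs_results for b in rhs_results)


# Possible results of the term Choose(s) for s = {1, 2} in a model where
# Choose may return any element of s (an assumption made for this example).
choose_results = {1, 2}
```

With these definitions even the trivial equation Choose(s) = Choose(s) fails under the rejected interpretation, while it holds under the adopted one.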

An alternate approach for specifying a nondeterministic operation would be to 
indirectly specify it by having the axioms specify its relation, which is deterministic. The 
relation can be specified using equations and conditional equations. However, the 
constraint that if the nondeterministic operation returns a normal value on an input, then 
the relation holds for the input and at least one result, cannot be expressed in terms of 
equations and conditional equations. This can be circumvented by assuming that every 
such relation satisfies the above constraint. If a nondeterministic operation signals on an 
input, some convention about the behavior of the relation on such an input must be 
decided. Using this approach, it is possible to specify the precise amount of 
nondeterminism an operation should have. However, we have adopted the former 
approach because of the following reasons: 

(i) We do not want the specification to specify the precise amount of nondeterminism an 
operation should have; instead, we leave this decision to the designer of an 
implementation, 

(ii) it seems more natural to directly specify the behavior of an operation than specifying 
the corresponding relation, 

(iii) the semantics of a specification designed using the latter approach would have to be 
derived indirectly, as should be evident from the discussion in the next section, and 

(iv) if we adopt the latter approach, the normal behavior of the nondeterministic 
operation would be indirectly specified by specifying its relation, whereas its exceptional 
behavior would be directly specified. We would like to avoid using two notations for the 
same concept. 


But one major advantage of adopting the latter approach is that we do not have to develop 
any additional formalism for nondeterministic operations. The theory developed for 
specifications specifying only deterministic operations applies to nondeterministic 
operations also. 

3.1.6 Specification of Mutually Recursive Data Types 


A specification for mutually recursive data types is similar to a specification for 
nonrecursive data types. Let D stand for an instance of a group of mutually recursive data 
types being specified. The specification is given either the name of some data type in D or 
a name different from the names of data types in D. Like a specification of a nonrecursive 
data type, it has four components: 

(i) Operations, 
(ii) Auxiliary Functions, 
(iii) Restrictions, and 
(iv) Axioms. 


The Operations component specifies the syntactic properties of the operations of D. It is 
divided into subcomponents. There is a subcomponent entitled D corresponding to every 
data type D in D specifying the operations of D; such a subcomponent is like the operations 
component of a nonrecursive data type as discussed above. Besides, there is another 
subcomponent entitled Combined Operations, which specifies the syntactic properties of the 
operations not belonging to any particular data type, but rather to the whole group D. The 
remaining three components are the same as in a specification of a single data type. If D 
does not have any combined operations, the specifications of data types in D can be given 
separately like nonrecursive data types. However, the semantics of these specifications 
must be given together. 

Henceforth, we discuss only nonrecursive data types. From the following 
discussion, it should be clear how to extend the results and the theory to mutually recursive 
data types. For instance, we can give the semantics of such a specification in a similar way 
as for nonrecursive data types (discussed in the next section), except that we will need to use 
type algebras defined in Section 2.4. 




3.2 Semantics of Specification Language 


The semantics of a specification S is defined to be a set of related data types. 
Each data type in the set is said to be specified by S. Let D(S) stand for this set. Since a 
specification S refers to other specifications assuming them to be given (for example, the 
specification of Set-Int refers to the specifications of Int and Bool), the semantics of S is 
given using their semantics. For a defining type D' ∈ Δ used in S, we assume that D' has a 
specification S' having a nonempty set of data types as its semantics; D' stands for any data 
type in D(S'). 

If S does not specify any nondeterministic operation, then every data type in D(S) 
can be shown to be deterministic. Operations of different data types in D(S) share the 
common behavior specified by S. Different data types differ in the way their operations 
behave on inputs not satisfying the preconditions specified for the operations and/or on 
inputs on which the operations are specified to have the option between signalling and 
returning a value. If the axioms do not completely capture the observable behavior of the 
operations, then data types in D(S) have operations having different behavior on inputs on 
which the axioms leave their behavior unspecified. 

In case S specifies nondeterministic operations, then data types in D(S) also differ 
in the amount of nondeterminism their operations have. D(S) has data types in which the 
operations specified to be nondeterministic are deterministic as well as data types in which 
such operations have the maximum amount of nondeterminism allowed by S. For 
example, the semantics of the specification of Set-Int given in Figure 3.1 has a data type in 
which the operation Choose is deterministic, returning the maximum integer in a nonempty 
set s passed as the argument to Choose. It also has the data type Set-Int defined in the 
previous chapter in which Choose nondeterministically picks any element of s. In 
general, a data type in D(Set-Int) has the operation Choose return an element from a 
nonempty subset of s. 
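
A short Python sketch may help to picture the range of data types in D(Set-Int). Each hypothetical version of Choose below is modelled by the set of results it may return on a nonempty set s; the function names and this modelling are assumptions made for the example only.

```python
def choose_max(s):
    """Deterministic Choose: always returns the maximum integer in s."""
    return {max(s)}


def choose_any(s):
    """Choose with the maximum nondeterminism: may return any element of s."""
    return set(s)


def choose_ends(s):
    """An intermediate Choose: may return the smallest or the largest element."""
    return {min(s), max(s)}


def allowed_by_spec(choose, s):
    """The possible results must form a nonempty subset of the nonempty set s."""
    results = choose(s)
    return len(results) > 0 and results <= set(s)
```

All three versions are allowed on, say, s = {1, 2, 3}, whereas a Choose that could return an integer outside s would not be.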

The semantics of a specification specifying nondeterministic operations is thus 
necessarily a set of data types differing in the amount of nondeterminism these operations 
have, even if the specification does not specify any precondition or any optional exception 
condition for the operations and the specification completely specifies the observable 
behavior of the operations. This semantics of a specification is chosen because of our view 
that a specification should not constrain an implementation to have any precise amount of 
nondeterminism, and that the decision about how much nondeterminism an 
implementation should have be left to the designer of the implementation. Since a 
specification serves as an interface between the programs using the data type and the 
implementation(s), every theorem derived from the specification, as discussed in the next 
chapter, must hold for a correct implementation when interpreted appropriately. 

It is possible to write a specification in our language which specifies unbounded 
nondeterminism. (The term unbounded nondeterminism used here is different from the 
way it is used in [13, 35].) For example, the specification of Ny (a version of the data 
type natural number) in Figure 3.3 specifies unbounded nondeterminism because the 
operation Pick is specified to have unbounded nondeterminism. For such a specification 
there does not exist any data type having maximal amount of nondeterminism. We will 
precisely state the condition when a specification S specifies unbounded nondeterminism. 
For a specification specifying bounded nondeterminism, we define data types having 
maximal amount of nondeterminism allowed by the specification. 

Instead of giving the semantics of S directly in terms of data types, we give its 
semantics as a set of (well formed) type algebras. Let F(S) stand for this set. We then 
partition this set using the behavioral equivalence relation on type algebras and get the set 
D(S) of data types. Each type algebra in F(S) is a model of some data type specified by S. 
We first assume that S does not use any auxiliary functions, i.e., A_f = ∅ and A_t = ∅. 
Later, we discuss the semantics of S assuming that A_f ≠ ∅ and A_t ≠ ∅. 

3.2.1 Specifications without Auxiliary Functions 


A type algebra in F(S) must have the syntactic structure as specified in the 
operations component of S and the observable behavior as specified by the axioms and the 
restrictions in S. F(S) is inductively defined; as in Chapter 2, we combine the basis and 
inductive steps into a single step. F(S) consists of all (well formed) type algebras of the 
form 

    A = [{V_D' | D' ∈ Δ'}, EXV; {f_σ | σ ∈ Ω}] 

such that A satisfies the restrictions and the axioms in S, where for each D' ∈ Δ, V_D' is the 
principal domain of an algebra A' ∈ F(S'). A' is a model of a data type D' in D(S'). 


Figure 3.3. Specification of Ny 

Operations 
    0    :           → Ny 
    S    : Ny        → Ny 
    P    : Ny        → Ny 
                     → no-pred() 
    ≡    : Ny × Ny   → Bool    as x_1 ≡ x_2 
    ≥    : Ny × Ny   → Bool    as x_1 ≥ x_2 
    Pick :           → Ny      nondeterministic 

Restrictions 
    x ≡ 0 ⇒ P(x) signals no-pred() 

Axioms 
    P(S(x))   = x 
    x ≥ x     = T 
    x ≥ z     = if (x ≥ y ∧ y ≥ z) then T 
    S(x) ≥ x  = T 
    x ≥ S(x)  = F 
    x ≥ S(y)  = if ~(x ≥ y) then F 
    x ≡ y     = (x ≥ y ∧ y ≥ x) 
    Pick() ≥ 0 = T 


We first discuss when a type algebra A satisfies restrictions; later we discuss the 
axioms. Let X = { x_1, ..., x_n } stand for all variables in an axiom or a restriction. Let 
V = { v_1, ..., v_n }, where each v_i is a normal value of the appropriate type, stand for an 
A-instance of X, i.e., each v_i is an instance of x_i. 

3.2.1.1 Restrictions 


If a nontrivial precondition P_σ is specified for a constructor σ, then on an input V 
such that P_σ[X/V] interprets to F, f_σ(v_1, ..., v_n) either signals or returns a value 
constructible by the constructor functions using arguments satisfying their preconditions. 
It would be meaningless to allow f_σ to return an arbitrary value that cannot even be 
constructed. For example, if a data type satisfying the specification in Figure 3.2 has its 
Push operation signal overflow on stacks of size 128, it is absurd to let the operation Pop 
return a stack of size 1000 when applied on the empty stack, the input that does not satisfy 
the precondition specified for Pop. Similarly, if σ is an observer, then f_σ(v_1, ..., v_n) either 
signals or returns a value in V_D', where D' is the result type of σ. 

If the restrictions component specifies a required exception condition on σ as 

    R_σ(X) ⇒ σ(X) signals ex(e_1, ..., e_m), 

then for every V, if both P_σ[X/V] and R_σ[X/V] interpret to T, then f_σ(v_1, ..., v_n) must 
signal the exception value ex whose arguments are the interpretations of e_1[X/V], ..., e_m[X/V] 
in A, for A to satisfy the above restriction. 

If the restrictions component specifies σ to optionally signal an exception, i.e., 

    σ(X) signals ex(e_1, ..., e_m) ⇒ O_σ(X), 

then for every V such that P_σ[X/V] interprets to T and f_σ(v_1, ..., v_n) signals the exception 
ex with the interpretations of e_1[X/V], ..., e_m[X/V] as arguments to its handlers, O_σ[X/V] 
must interpret to T for A to satisfy the above restriction. 

Since the restrictions are assumed to completely specify the exceptional behavior 
of the operations, for every operation σ, the interpretation f_σ in A must be such that 
f_σ(v_1, ..., v_n) is a normal value if (i) P_σ[X/V] holds, (ii) none of the R_σ[X/V] holds, and 
(iii) none of the O_σ[X/V] holds. 
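
The three requirements above can be sketched as a single check in Python. This is a simplification under stated assumptions: one operation with one exception name, exception arguments ignored, a finite list of inputs, and hypothetical names throughout (Signalled, satisfies_restrictions, pred, lazy_pred). The example operation is the predecessor P of Figure 3.3, modelled on Python integers.

```python
class Signalled(Exception):
    """Hypothetical stand-in for the single exception an operation may signal."""


def satisfies_restrictions(f, inputs, pre, required, optional):
    """Check one operation f against its precondition and exception conditions."""
    for x in inputs:
        if not pre(x):
            continue  # behavior outside the precondition is not constrained here
        try:
            f(x)
            signalled = False
        except Signalled:
            signalled = True
        if required(x) and not signalled:
            return False  # a required exception was not signalled
        if signalled and not (required(x) or optional(x)):
            return False  # signalled although neither condition holds
    return True


def pred(n):
    """P of Figure 3.3: signals no-pred() on 0."""
    if n == 0:
        raise Signalled("no-pred")
    return n - 1


def lazy_pred(n):
    """A version that never signals, violating the required condition."""
    return max(n - 1, 0)
```

Here pred passes the check with the required condition x = 0 and no optional condition, while lazy_pred fails it.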


3.2.1.2 Axioms 


A (behaviorally) satisfies an equation 'e_1 = e_2' (or 'e_1 = e_2' holds in A) if and only if 
for every V, one of the following conditions holds: 

(i) the instantiation of e_1 or of e_2 interprets to an exception or is undefined, 

(ii) the input to an invocation of some f_σ on v'_1, ..., v'_k does not satisfy the 
precondition associated with σ (i.e., P_σ(v'_1, ..., v'_k) interprets to F) when 
the instantiations of e_1 and e_2 are interpreted, or 

(iii) the interpretation of e_1[X/V] in A is observably equivalent to the interpretation of 
e_2[X/V] in A. 


In the previous section, we informally described the semantics of conditional 
equations using the auxiliary functions if-then-else. Here we formalize the discussion. To 
check whether a conditional equation 'e_1 = if b then e_2' holds in A, we extend A to include 
the interpretation of the auxiliary function if-then-else : Bool × D' × D' → D' 
corresponding to every D' ∈ Δ'. The interpretation f_if-then-else in the extended algebra has 
the following behavior: 

    f_if-then-else(T, v_1, v_2) = v_1 
    f_if-then-else(F, v_1, v_2) = v_2 

The interpretation of a conditional equation involving if-then-else can be verified to be 
equivalent to interpreting the formula 'b = T ⇒ (e_1 = e_2)' as we require that b behave 
deterministically for every A-instance. Henceforth, we view a conditional equation as a 
formula 'b ⇒ e_1 = e_2' so that we do not have to consider the auxiliary functions if-then-else. 

If a type algebra A is in F(S), then we say that A behaviorally satisfies S, and call 
A a model of the specification S. Note that A satisfies the axioms under the interpretation 
of the symbol '=' as the observable equivalence relation on the domains of a type algebra. 
If a model A of S satisfies the axioms interpreting '=' as the identity relation as in Logic, we 
say that A identically satisfies S. 

For example, the models A_SI and A'_SI of the data type Set-Int discussed in 
Chapter 2 can be shown to be in F(Set-Int). So, they are also the models of the 
specification of Set-Int given in Figure 3.1. A_SI identically satisfies the specification of 
Set-Int. It should be easy to see that every nice algebra in F(S) identically satisfies a 
specification S because the observable equivalence relations are the identity relations. 

Using the fact that the set E of observable equivalence relations on the domains in 
A above is a congruence, we have 

Thm. 3.1 A ∈ F(S) iff A/E ∈ F(S). ∎ 

So, to check whether a type algebra A is in F(S), we can check whether its reduced algebra 
A/E identically satisfies S. Using the above theorem, we get 

Thm. 3.2 If A ∈ F(S), then every type algebra behaviorally equivalent to A is in F(S). ∎ 
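
The distinction between behavioral and identical satisfaction, and the role of the reduced algebra, can be pictured with a toy Python sketch. It assumes a hypothetical model of integer sets whose values are lists (keeping duplicates and insertion order), a single observer has, and a finite range of probe integers; none of these names come from the specification language.

```python
def insert(rep, i):
    """Constructor of the list model: duplicates and order are kept."""
    return rep + [i]


def has(rep, i):
    """The only observer in this sketch."""
    return i in rep


def observably_equivalent(r1, r2, probes):
    """Two values are equivalent if no observer application tells them apart."""
    return all(has(r1, i) == has(r2, i) for i in probes)


def reduced(rep):
    """Canonical representative of the equivalence class of rep."""
    return frozenset(rep)
```

In this sketch the lists [1, 2, 1] and [2, 1] are distinct values that are observably equivalent, so an equation between them holds behaviorally but not identically; after reduction they coincide and the same equation holds identically.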




3.2.2 Specifications with Auxiliary Functions 


An auxiliary function is not a part of a data type, so a model in F(S) cannot have 
any interpretation for the auxiliary functions. We first define an extended data type D_e 
from D, whose operation set is Ω ∪ A_f and the set of defining types is Δ ∪ A_t. If the 
Auxiliary Functions component is included in the Operations component in S, the modified 
specification S_e is a specification of data types having the same syntactic structure as D_e, 
and S_e does not use any auxiliary functions. We define F(S_e) for the modified 
specification S_e as discussed above. An algebra A_e of type D_e in F(S_e) is 

    A_e = [{V^e_D' | D' ∈ Δ' ∪ A_t}, EXV; {f^e_σ | σ ∈ Ω ∪ A_f}]. 

So an auxiliary term can be interpreted in A_e. The axioms in S expressed using the 
auxiliary functions in A_f hold in A_e. 

For every algebra A_e of type D_e in F(S_e), we get an algebra A of type D in 
F(S) as follows: 

    A = [{V_D' | D' ∈ Δ'}, EXV; {f_σ | σ ∈ Ω}], 

where for each D' ∈ Δ, V_D' = V^e_D', and V_D ⊆ V^e_D. A function f_σ is a restriction of f^e_σ to the 
domains of A such that V_D is the smallest set closed under finitely many applications of the 
functions in {f_σ | σ ∈ Ω}. V_D can be a proper subset of V^e_D because S may use an 
auxiliary function having D as its range that constructs some extraneous values (see [29] for 
an example of such a specification).⁶ 

For example, the model of the data type Stk-Int-100 discussed in Chapter 2 
can be shown to be in F(Stk-Int). We must extend this model to include the interpretation 
f_Size of the auxiliary function Size such that f_Size(<i_1, ..., i_m>) = m, and use the extended 
algebra for proving that it satisfies the axioms and restrictions in Figure 3.2. 


6. However, we do not encourage specifications in which auxiliary functions are of result type D and 
generate values not constructible by the constructors of D. 




3.2.3 Semantics of a Specification 


Using Theorem 3.2, we partition F(S) using the behavioral equivalence relation 
on type algebras, and get the set D(S) of data types as the semantics of S. A reduced 
algebra in every equivalence class in the partition on F(S) can serve as a representative of 
the data type defined by the equivalence class. This can be pictorially expressed as 

[Diagram: F(S) partitioned into equivalence classes of models, one class per data type in D(S)] 

where D_1, ..., D_i, ... are the data types in D(S) and A_i1, ..., A_ij, ... are the 
models of a data type D_i. 

It should be clear from the discussion in the last two subsections that the 
operations of different data types in D(S) share the behavior specified by S. However, they 
differ in 

(i) the amount of nondeterminism they have, if specified to be nondeterministic by S, 

(ii) their behavior on inputs not satisfying the preconditions specified by S, 

(iii) their behavior on inputs satisfying the preconditions and optional exception 
conditions specified by S, and 

(iv) their behavior on inputs on which their behavior is unintentionally omitted in S. 

If S specifies σ to optionally signal on a subset of inputs, σ for different data types may or 
may not signal for some of the inputs in the subset. If the constructors are specified to 
optionally signal for expressing the size requirement on the values of a data type, different 
data types have different upper bounds on the size of their values. 

For example, D(Set-Int) defines different data types in which Choose behaves 
differently because it has different amounts of nondeterminism, as was discussed earlier. 
D(Stk-Int) has different data types whose operations Pop and Replace have different 
behavior on the empty stack, and the operation Push behaves differently on stacks of 
size > 100. Some of the data types differ in the maximum size allowed of the stacks. The 
data type Stk-Int-100 defined in Chapter 2 is in D(Stk-Int). 




3.3 Specification of a Data Type and Equivalence of Specifications 


Def. 3.2 A specification S specifies a data type D iff D ∈ D(S) (i.e., M_D ⊆ F(S)).⁷ ∎ 


If a specification S specifies the data type D, the specification need not be precise in the 
sense that it may not completely specify the behavior of D; a portion of the behavior may 
not be, in fact, captured by S at all. There may be data types in D(S) different from D. We 
introduce the following stronger definition for specifications specifying deterministic 
operations only. 

Def. 3.3.1 S precisely specifies D iff D(S) = { D } (i.e., M_D = F(S)). ∎ 


The above definition requires that the specification of a defining type D’ € A also precisely 
specifies D’. 

For a specification specifying nondeterministic operations, its semantics has data 
types differing in the amount of nondeterminism their operations have. nondeterminism 
allowed by S. We define a partial ordering on type algebras in F(S) which orders data 
types in D(S) based on the amount of nondeterminism in their operations that are specified 
to be nondeterministic by S. Instead of comparing two arbitrary type algebras in F(S), it is 
convenient to compare algebras having the same domains but differing in their functions. 


Def. 3.4 Given two type algebras A and A' of D, 

    A  = [{V_D' | D' ∈ Δ'}, EXV; {f_σ | σ ∈ Ω}] 

    A' = [{V_D' | D' ∈ Δ'}, EXV; {f'_σ | σ ∈ Ω}], 

A' is at least as nondeterministic as A, expressed as A ≤_nd A', if and only if 
for every operation σ ∈ Ω, and for each input v_1, ..., v_n, 

    {f_σ(v_1, ..., v_n)} ⊆ {f'_σ(v_1, ..., v_n)}. ∎ 


Informally, the above means that every function in A' is at least as much nondeterministic 
as the corresponding function in A. We say that A <_nd A' if and only if A ≤_nd A' and there 
is at least one nondeterministic function f'_σ in A' such that for some v_1, ..., v_n, 
{f_σ(v_1, ..., v_n)} ⊆ {f'_σ(v_1, ..., v_n)} and {f_σ(v_1, ..., v_n)} ≠ {f'_σ(v_1, ..., v_n)}. 


7. Recall that M_D is the set of all models of the data type D. 

We can order the reduced models in F(S) using the <_nd relation. 


Def. 3.5 A reduced model A in F(S) has maximal amount of nondeterminism allowed by S 
if and only if there is no reduced model A' in F(S) such that A <_nd A'. ∎ 


If a reduced algebra A ∈ F(S) has maximal amount of nondeterminism allowed by S, then 
it can be shown that any algebra behaviorally equivalent to A also has maximal amount of 
nondeterminism allowed by S. Using this, we get 


Def. 3.6 A data type D ∈ D(S) has maximal amount of nondeterminism allowed by S if its 
reduced model has maximal amount of nondeterminism allowed by S. ∎ 
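
The ordering on type algebras with the same domains can be sketched in Python. As an assumption made for this example only, an algebra is represented by a dictionary mapping each pair (operation name, input) to the frozenset of results the operation may return on that input, and both algebras are taken to have the same keys.

```python
def at_least_as_nd(a, a_prime):
    """A <=_nd A': every result set in A is contained in the one in A'."""
    return all(a[key] <= a_prime[key] for key in a)


def strictly_less_nd(a, a_prime):
    """A <_nd A': A <=_nd A' and some result set in A' is strictly larger."""
    return at_least_as_nd(a, a_prime) and any(a[key] < a_prime[key] for key in a)


def is_maximal(a, models):
    """No model in the given collection is strictly more nondeterministic than a."""
    return not any(strictly_less_nd(a, other) for other in models)


# Two hypothetical models of Choose on the set {1, 2}.
key = ("Choose", frozenset({1, 2}))
deterministic = {key: frozenset({2})}
fully_nd = {key: frozenset({1, 2})}
```

Here deterministic <_nd fully_nd holds, and among these two models only fully_nd is maximal.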


For example, the model A_SI has maximal amount of nondeterminism allowed by 
the specification of Set-Int in Figure 3.1, so the data type Set-Int defined in Chapter 2 has 
maximal amount of nondeterminism allowed by the specification in Figure 3.1. It is easy to 
see that no model of the specification of Ny in Figure 3.3 can have maximal amount of 
nondeterminism; given any model A, we can find an A' such that A <_nd A'. 


Def. 3.7 A specification S specifies unbounded nondeterminism if and only if D(S) is not 
empty and there does not exist a data type in D(S) with maximal amount of 
nondeterminism allowed by S. ∎ 


So, the specification of Ny specifies unbounded nondeterminism because of the operation 
Pick. The specification of Set-Int specifies bounded nondeterminism as there are data 
types with maximal amount of nondeterminism allowed by the specification of Set-Int in 
D(Set-Int). In this thesis, we have considered data types with operations having only finite 
nondeterminism, so we are interested in specifications that specify bounded 
nondeterminism. Henceforth, we assume that a specification S does not specify 
unbounded nondeterminism. 
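
Why the specification of Figure 3.3 admits no maximal model can be seen from a small Python sketch. It assumes, in line with the restriction to finite nondeterminism, that a model gives Pick a finite, nonempty set of natural numbers as possible results; the function name is hypothetical.

```python
def more_nondeterministic(pick_results):
    """Return a strictly larger result set that still satisfies Pick() >= 0 = T."""
    return pick_results | {max(pick_results) + 1}


pick_a = frozenset({0, 3})
pick_b = more_nondeterministic(pick_a)
```

Whatever finite result set a model chooses for Pick, this construction yields another admissible model that is strictly more nondeterministic, so no model can be maximal.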


In case of a specification specifying nondeterministic operations, we have 


Def. 3.3.2 S precisely specifies D if {D} = { D_i | D_i ∈ D(S) and D_i has maximal 
amount of nondeterminism allowed by S }. ∎ 


The above definition also covers the case 3.3.1 above, as in case of a specification specifying 
only deterministic operations, the set { D_i | D_i ∈ D(S) and D_i has maximal amount of 
nondeterminism allowed by S } is the same as D(S). For 
example, the specification in Figure 3.1 precisely specifies the data type Set-Int defined in 
Chapter 2, whereas the specification in Figure 3.2 does not precisely specify the data type 
Stk-Int-100 defined in Chapter 2. 

We can also show that a specification S is correct w.r.t. a model A by showing 
that A ∈ F(S). 

We can define equivalence among specifications as follows: 


Def. 3.8 Two specifications S_1 and S_2 are equivalent, expressed as S_1 ≡ S_2, iff 
D(S_1) = D(S_2) (i.e., F(S_1) = F(S_2)). ∎ 


Note that we do not make any distinction between a specification i in which the 
constructors are ‘completely’ specifi ied and another specification in which some of the 
properties of the constructors are not specified. For example, the ‘specification of Set-Int. 
does not specify the property of Insert that the order i in which integers are inserted does not 
matter. The specification in Figure 3.1 is equivalent to the new specification obtained by 
adding the following axiom because both have the same semantics: - . 

- Insert(Insert(s, il), i2) = if 1 i2:then Insert(s; it) ele: Insert(Insert(s, 12), il). 
However, as we discuss in Chapter 4, it is possible to prové nsore properties about Set-Int 
using the specification with the above axiom than the specification given in Figure 3.1. We 
distinguish between the two specifications there, and define a stronger equivalence relation 
on specifications which incorporates this distinction. 
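
As a minimal sketch of why adding the axiom leaves the semantics unchanged, the following Python fragment checks the axiom in a hypothetical model of Set-Int built on `frozenset`. The names `insert`, `axiom_lhs`, `axiom_rhs` and `axiom_holds` are illustrative assumptions and do not come from the specification in Figure 3.1; the check covers only a few small inputs and is not a proof.

```python
from itertools import product

EMPTY = frozenset()  # assumed model of the empty set constant


def insert(s, i):
    """Hypothetical model of Insert: add the integer i to the set s."""
    return s | {i}


def axiom_lhs(s, i1, i2):
    # Insert(Insert(s, i1), i2)
    return insert(insert(s, i1), i2)


def axiom_rhs(s, i1, i2):
    # if i1 = i2 then Insert(s, i1) else Insert(Insert(s, i2), i1)
    return insert(s, i1) if i1 == i2 else insert(insert(s, i2), i1)


def axiom_holds(values=(0, 1, 2)):
    """Check the axiom on a few small sets and integers."""
    sets = [EMPTY, insert(EMPTY, 0), insert(insert(EMPTY, 0), 2)]
    return all(
        axiom_lhs(s, a, b) == axiom_rhs(s, a, b)
        for s, a, b in product(sets, values, values)
    )
```

Because the model already identifies sets built by inserting the same integers in a different order, the axiom holds in it without being stated, which is the sense in which both specifications have the same semantics.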

We have discussed above one way of precisely specifying a data type D. As stated
in the beginning of this section, D can be presented in many ways.⁸ One way is to present


8. We have deliberately used the word 'presented' instead of 'specified' to avoid confusion, as we have
precisely characterized above when a data type can be specified.




a representative model A and define the semantics of such a presentation to be { A' | A' is
behaviorally equivalent to A }, as in [3]. There could be other ways of presenting data
types. If the semantics of these methods can be given in terms of type algebras using our
formalism, we can relate specifications given using different methods (see discussion in
Section 3.6).


3.4 Specification of Bool 


In Chapter 2, we defined the data type Bool which serves as the basis of our
formalism. Figure 3.4 contains a specification of Bool; this specification cannot be
expressed in the proposed specification language because it has an inequality

T ≠ F

as an axiom. This axiom is introduced to capture the property that the boolean constants T
and F are distinguishable from each other. The semantics of the specification is the data
type Bool; it can be verified that every axiom in the specification holds in a model of Bool.
Because of the inequality, we do not need to introduce inequalities in the specifications of
other data types; we will show in the next chapter (Subsection 4.2.3) how to deduce them
using the above inequality. The specification of Bool is assumed to be given.


Figure 3.4. Specification of Bool

Operations

T       :             → Bool
F       :             → Bool
not     : Bool        → Bool    as ~x
or      : Bool × Bool → Bool    as x_1 ∨ x_2
and     : Bool × Bool → Bool    as x_1 ∧ x_2
implies : Bool × Bool → Bool    as x_1 ⇒ x_2
eqv     : Bool × Bool → Bool    as x_1 ≡ x_2

Axioms

T ≠ F
~T = F
~F = T
x ∨ y = y ∨ x
x ∨ T = T
F ∨ F = F
x ∧ y = ~((~x) ∨ (~y))
(x ⇒ y) = (~x) ∨ y
x ≡ y = (x ⇒ y) ∧ (y ⇒ x)
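
The claim that every axiom of Figure 3.4 holds in a model of Bool can be illustrated with a small sketch. It assumes the standard two-valued model, with Python's `True` and `False` standing in for T and F; the function names are hypothetical, and each operation is defined directly rather than through the axioms, so that the check is not circular.

```python
from itertools import product

T, F = True, False  # assumed interpretation of the constants T and F


def not_(x):
    return not x


def or_(x, y):
    return x or y


def and_(x, y):
    return x and y


def implies(x, y):
    # Defined independently of the axiom (x => y) = (~x) v y.
    return y if x else True


def eqv(x, y):
    return x == y


def bool_axioms_hold():
    """Evaluate every axiom of Figure 3.4 over all boolean assignments."""
    checks = [T != F, not_(T) == F, not_(F) == T, or_(F, F) == F]
    for x, y in product((T, F), repeat=2):
        checks += [
            or_(x, y) == or_(y, x),
            or_(x, T) == T,
            and_(x, y) == not_(or_(not_(x), not_(y))),
            implies(x, y) == or_(not_(x), y),
            eqv(x, y) == and_(implies(x, y), implies(y, x)),
        ]
    return all(checks)
```

A model that interpreted T and F as the same value would satisfy all the equations but fail the inequality, which is why the inequality has to be stated as an axiom.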




3.5 Properties of a Specification 


We define two properties of a specification, namely consistency and behavioral
completeness, based on its semantics. These properties are different from the consistency
and sufficient completeness properties defined by Guttag and Horning [28], which are
proof theoretic (i.e., based on what can be deduced from a specification). We discuss the
relationships between the properties introduced in this section and the properties defined
by Guttag and Horning in the next chapter.

Consistency and behavioral completeness are both structural properties; they
ensure proper relationships among different components of a specification. Generally
speaking, consistency means that a property assumed already is not invalidated. In this
case, it means that properties expressed in the specification of a defining type or an
auxiliary type, or the assumptions made about the way the exceptional behavior of the
operations is specified, are not invalidated. It ensures that a specification specifies at least
one data type.

Behavioral completeness captures the intuition that a specification may intentionally leave
the operation behavior unspecified by associating preconditions and optional exception
conditions with the operations. Apart from this intentional incompleteness, a specification may
be incomplete because the designer unintentionally omitted some axioms. The behavioral
completeness property ensures that a specification is only intentionally incomplete. So, it
warns against any unintentional omission. It is a desirable property for most specifications.

We first discuss the consistency property; later, we discuss the behavioral
completeness property.




3.5.1 Consistency


A specification S is, informally speaking, inconsistent
(i) if S specifies ground terms of a defining type (or an auxiliary type) that are specified
to be distinguishable by its specification, to be observably equivalent, or
(ii) if S specifies ground terms of a defining type (or an auxiliary type) that are specified
to be observably equivalent by its specification, to be distinguishable.


An example of the first case would be a specification S1 using the specification of Bool and
specifying T and F to be observably equivalent. An example of the second case is the
specification of EX1 given in Figure 3.5. The data type EX1 has only one value. The
predicate P distinguishes among observably equivalent ground terms of Set-Int; P returns
T if and only if in its set argument, an integer has been inserted more than once; otherwise,
it returns F. This property of the set values is not observable by the operations of Set-Int as
specified in Figure 3.1.

In either case, S does not have any models, i.e., F(S) = ∅. In the first case, no
type algebra can satisfy S because one of the axioms would want two distinguishable values
in the domain of D' to be observably equivalent. In the second case, S does not have any
models because of the well formedness property of a type algebra (which is that the set of
observable equivalence relations is a congruence).

EX1 cannot be implemented in any programming language in which an
implementation of a data type is hierarchically structured and the representation of a data
type is hidden from the users of the data type, since only the external behavior of Set-Int
can be observed. Thus the predicate P cannot be implemented because the
implementation of P must distinguish between, for example, the observably equivalent
ground terms Insert(Insert(∅, 0), 0) and Insert(∅, 0). Polajnar [e711 has also discussed such a
violation by a specification S of the specifications of the defining types. He said such a


Figure 3.5. Specification of EX1

Operations

a : → EX1
P : EX1 × Set-Int → Bool

Axioms

P(a, ∅) = F
P(a, Insert(s, i)) = if i ∈ s then T else P(a, s)




specification had protection errors.
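
The violation described above can be sketched as follows. The fragment represents ground terms of Set-Int as nested tuples and interprets them in a hypothetical `frozenset` model; `P` follows the prose description of the predicate (T if and only if some integer has been inserted more than once). All names are illustrative assumptions, not part of Figure 3.5.

```python
EMPTY = ("Empty",)  # ground term for the empty set


def Insert(term, i):
    """Build the ground term Insert(term, i) without evaluating it."""
    return ("Insert", term, i)


def value(term):
    """Interpret a ground term in an assumed frozenset model of Set-Int."""
    if term == EMPTY:
        return frozenset()
    _, sub, i = term
    return value(sub) | {i}


def P(term):
    """P read as a function on terms: True iff some integer is inserted twice."""
    if term == EMPTY:
        return False
    _, sub, i = term
    return True if i in value(sub) else P(sub)


t1 = Insert(Insert(EMPTY, 0), 0)
t2 = Insert(EMPTY, 0)
```

Here `value(t1)` and `value(t2)` are the same set, yet `P(t1)` and `P(t2)` differ, so P cannot be a function on the values of Set-Int; it depends on how a value was constructed, which a hidden representation does not reveal.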

A specification can also be inconsistent because the exceptional behavior of the
operations is not properly specified; for example, the boolean conditions in exception
condition restrictions may not be disjoint.


Def. 3.9 A specification S is consistent if and only if (i) the specification S' of D', for each
defining type or auxiliary type D' of S, is consistent, and (ii) D(S) is not the empty set. ■


A specification S defines observable equivalence relations of ground terms just
like a data type does. By a term here, we mean a term constructed without using auxiliary
functions.


Def. 3.10 S specifies two ground terms e_1 and e_2 of type D' ∈ Δ' to be observably equivalent
(or e_1 and e_2 are observably equivalent by S) iff e_1 and e_2 are observably equivalent in every
data type in D(S) (i.e., the possible interpretations of e_1 in a model A ∈ F(S) are observably
equivalent to the possible interpretations of e_2 in A). ■


Def. 3.11 S specifies e_1 and e_2 to be distinguishable iff e_1 and e_2 are distinguishable in every
data type in D(S) (i.e., the possible interpretations of e_1 in a model A in F(S) are
distinguishable from the possible interpretations of e_2 in A). ■


For example, Insert(Insert(∅, 1), 1) and Insert(∅, 1) are specified by the specification of
Set-Int to be observably equivalent. Insert(∅, 1) and Insert(∅, 2) are distinguishable.
However, the specification in Figure 3.2 does not specify Pop(Null) and Null to be
observably equivalent or distinguishable. If S is inconsistent, there are ground terms which
are both observably equivalent as well as distinguishable by S, because D(S) is the empty
set.

Since a specification S may leave the behavior of operations unspecified on
certain inputs using the precondition and/or optional exception restrictions, there may in
general exist ground terms of type D' ∈ Δ' which are neither specified by S to be observably
equivalent nor distinguishable. For example, Pop(Null) is neither observably equivalent to
Null nor distinguishable from Null by the specification of Stk-Int in Figure 3.2, as a data
type in D(Stk-Int) may have Pop return the empty stack itself when invoked on the empty
stack and another data type in D(Stk-Int) may have Pop signal on the empty stack. Ground
terms involving nondeterministic operations may also be neither observably equivalent nor
distinguishable by S; for example, the ground term Choose(Insert(Insert(Null, 1), 3)) is
neither observably equivalent to nor distinguishable from 3. The above observable
equivalence and distinguishability relations capture the common behavior of data types in
D(S).
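
The Pop(Null) example can be made concrete with a small sketch of two hypothetical data types that could both belong to D(Stk-Int). Stacks are modeled as tuples; the exception class `Signal` and the exception name "underflow" are assumptions introduced for illustration and are not taken from Figure 3.2.

```python
NULL = ()  # assumed model of the empty stack


class Signal(Exception):
    """Stands in for an exception signalled by an operation."""


def push(s, i):
    return s + (i,)


def pop_lenient(s):
    """One possible Pop: returns the empty stack when invoked on it."""
    return s[:-1]


def pop_strict(s):
    """Another possible Pop: signals when invoked on the empty stack."""
    if s == NULL:
        raise Signal("underflow")
    return s[:-1]


def pop_null_equals_null(pop):
    """True iff Pop(Null) terminates normally and yields Null."""
    try:
        return pop(NULL) == NULL
    except Signal:
        return False
```

The two versions agree on every nonempty stack and differ only on Null, so neither "Pop(Null) is observably equivalent to Null" nor "Pop(Null) is distinguishable from Null" holds in every data type of the set.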

3.5.2 Behavioral Completeness


In the definition of behavioral completeness, we must capture the intentional
incompleteness of a specification. If a specification S associates a nontrivial precondition
with an operation, different data types in D(S) can have such an operation behaving
differently on an input not satisfying the precondition. If an operation is specified to have
an option to signal when its input satisfies a condition, different data types in D(S) can have
such an operation signalling the specified exception or terminating normally on an input
satisfying the associated condition. If S specifies a nondeterministic operation, different
data types in D(S) can have such an operation having as much nondeterminism as desired.
This incompleteness in S is intentional. Any other difference in the behavior of data types
in D(S) is unintentional.

The above means that for a specification S to be behaviorally complete, data types
in D(S) having maximal amount of nondeterminism allowed by S must have the same
observable behavior on intended inputs, except that if there is an optional exception
condition specified for an operation, then the operation has the option of signalling or
terminating normally on an input satisfying the boolean condition in the optional exception
condition.

We define three relations on the models in F(S). The partial isomorphic
equivalence relation formalizes the intentional incompleteness introduced due to the
nontrivial preconditions specified for the operations in S. The isomorphic embeddability
relation formalizes the intentional incompleteness due to the operations specified to have
the option to signal exceptions. Later we combine them to define the partial isomorphic
embeddability on reduced models in F(S). We use the partial isomorphic embeddability
relation to define the behavioral completeness of a specification by relating the reduced
models of data types in D(S) having the maximum amount of nondeterminism allowed by
the specification S.

3.5.2.1 Partial Isomorphic Equivalence


Let P_σ be a precondition specified for σ in S. Let S' be the specification of a
defining type D' ∈ Δ in S. The partial isomorphic equivalence relation relates models
whose operations have the same behavior on inputs satisfying their preconditions. The
definition is obtained by modifying the definition of isomorphic equivalence (Def. 2.13)
given in Chapter 2. As in Chapter 2, we assume that the domains corresponding to each
D' ∈ Δ in models A_1 and A_2 are defined by the isomorphically equivalent models in F(S')
and that the isomorphic equivalence relation on these models in F(S') induces a bijection
Φ_D' : V^1_D' → V^2_D'.


Def. 3.12 Given two algebras A_1 and A_2 in F(S),
    A_1 = [ { V^1_D' | D' ∈ Δ' }, EXV_1 ; { f^1_σ | σ ∈ Ω } ]
    A_2 = [ { V^2_D' | D' ∈ Δ' }, EXV_2 ; { f^2_σ | σ ∈ Ω } ]
such that for each D' ∈ Δ, V^1_D' and V^2_D' are the value sets defined by isomorphically
equivalent models A'_1 and A'_2 in F(S'), where S' is a specification of D', and Φ_D' : V^1_D' → V^2_D'
is a bijection induced due to the isomorphic equivalence of A'_1 and A'_2, A_1 and A_2 are
isomorphically equivalent w.r.t. { P_σ | σ ∈ Ω } (or w.r.t. S) iff there are bijections
Φ_D : V^1_D → V^2_D and Φ_EXV : EXV_1 → EXV_2 such that Φ = { Φ_D' | D' ∈ Δ' } ∪ { Φ_EXV } has
the following properties:
(i) for each ex : D_1 × ... × D_m, and for v_1 of type D_1, ..., v_m of type D_m,
    Φ_EXV(ex(v_1, ..., v_m)) = ex(Φ_D1(v_1), ..., Φ_Dm(v_m)), and
(ii) for each σ ∈ Ω, σ : D_1 × ... × D_n → D_{n+1},
    for every v_1 of type D_1, ..., v_n of type D_n, if P_σ(v_1, ..., v_n) = T, then
    (a) if neither f^1_σ nor f^2_σ signals, then
        { Φ_D{n+1}(f^1_σ(v_1, ..., v_n)) } = { f^2_σ(Φ_D1(v_1), ..., Φ_Dn(v_n)) }; otherwise,
    (b) Φ_EXV(f^1_σ(v_1, ..., v_n)) = f^2_σ(Φ_D1(v_1), ..., Φ_Dn(v_n)). ■




We also call A_1 and A_2 partially isomorphically equivalent when { P_σ | σ ∈ Ω } is evident
from the context.

The reason for requiring Φ_D to be a bijection (and not a partial one to one
function) is the assumption that for the case when a constructor is specified to have a
nontrivial precondition, if it terminates normally on an input not satisfying its precondition,
the value returned can be constructed by the constructors using inputs satisfying their
preconditions.
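
The idea behind partial isomorphic equivalence, namely that two models need to agree only on inputs satisfying the preconditions, can be sketched on a finite fragment. The fragment below is a simplification under stated assumptions: it handles a single one-argument operation, takes the identity as the bijection by default, and uses a hypothetical precondition "the stack is nonempty" for Pop; none of these names come from the specification of Stk-Int.

```python
class Signal(Exception):
    """Stands in for an exception signalled by an operation."""


def outcome(op, x):
    """Return ('value', v) on normal termination or ('signal', name) otherwise."""
    try:
        return ("value", op(x))
    except Signal as exc:
        return ("signal", str(exc))


def partially_equivalent(op1, op2, pre, inputs, phi=lambda v: v):
    """Check agreement of op1 and op2, up to phi, on inputs satisfying pre."""
    for x in inputs:
        if not pre(x):
            continue  # behavior outside the precondition is left unconstrained
        kind1, r1 = outcome(op1, x)
        kind2, r2 = outcome(op2, phi(x))
        if kind1 != kind2:
            return False
        if kind1 == "value" and phi(r1) != r2:
            return False
        if kind1 == "signal" and r1 != r2:
            return False
    return True


def pop_lenient(s):
    return s[:-1]


def pop_strict(s):
    if not s:
        raise Signal("underflow")
    return s[:-1]


STACKS = [(), (1,), (1, 2)]


def non_empty(s):
    return len(s) > 0


def always(s):
    return True
```

With the precondition `non_empty`, the two versions of Pop are related; with the trivial precondition `always`, they are not, since they differ on the empty stack.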


3.5.2.2 Isomorphic Embeddability


In the definition of the isomorphic embeddability relation, we want to capture the
intuition that if a specification S associates an optional exception condition with an
operation σ, then on an input X satisfying the associated boolean condition O(X), the
function corresponding to σ either behaves the same in different algebras in F(S) (i.e., it
either returns the 'same' value or signals the 'same' exception value), or the function
behavior differs in different algebras to the extent that in one algebra, the function signals
the desired exception value and in the other, the function returns the desired normal value.
The condition (iii) in the definition below captures this.

If any constructor σ is specified to optionally signal, then the value set of D
defined by one algebra in F(S) may be a subset of the value set of D defined by another
algebra in F(S). (In fact, one value set may have a value that is distinguishable from every
value in the other value set.) That is why in the definition below, we do not require the
mapping relating the value sets of D in two algebras to be a bijection; instead, it is required
to be a one to one partial function.⁹ However, the mapping must be defined for every
value constructed by the function corresponding to a constructor σ using inputs which
satisfy the associated precondition and do not satisfy any boolean condition stated in a
required exception condition or an optional exception condition specified for σ. This
constraint is captured in the condition (i) below.


9. That is also the reason for calling the relation isomorphically embeddable.




Def. 3.13 Given two algebras A_1 and A_2 in F(S) satisfying the requirement about the
domain corresponding to D' ∈ Δ stated in Def. 3.12, A_1 is isomorphically embeddable in A_2
w.r.t. S iff there exist 1-1 partial functions Φ_D : V^1_D → V^2_D and Φ_EXV : EXV_1 → EXV_2, with
the following properties:

(i) for every set of values v_1, ..., v_n for a constructor σ, if
    (a) P_σ[x_1/v_1, ..., x_n/v_n] holds,
    (b) for every required exception condition specified for σ, its boolean condition
        R[x_1/v_1, ..., x_n/v_n] does not hold, and
    (c) for every optional exception condition specified for σ, its boolean condition
        O[x_1/v_1, ..., x_n/v_n] does not hold,
    then Φ_D is defined for every value f^1_σ(v_1, ..., v_n),

(ii) for every exception name ex : D_1 × ... × D_m,
    Φ_EXV(ex(v_1, ..., v_m)) = ex(Φ_D1(v_1), ..., Φ_Dm(v_m)) if Φ_Di(v_i) is defined for each
    1 ≤ i ≤ m, and

(iii) for each σ ∈ Ω, for every set of values v_1, ..., v_n such that Φ_Di(v_i) is defined for each
    1 ≤ i ≤ n,
    (a) if on v_1, ..., v_n, f^1_σ signals an exception value ex(v'_1, ..., v'_m) specified to be
        optional by S, then the associated condition O[x_1, ..., x_n] holds on v_1, ..., v_n and
        f^2_σ(Φ_D1(v_1), ..., Φ_Dn(v_n)) either signals ex(Φ_D1(v'_1), ..., Φ_Dm(v'_m)) or returns
        some v, or
    (b) if Φ_D1(v'_1), ..., Φ_Dm(v'_m) are defined and f^2_σ signals an exception value
        ex(Φ_D1(v'_1), ..., Φ_Dm(v'_m)) specified to be optional by S on input Φ_D1(v_1), ..., Φ_Dn(v_n),
        then the associated condition O[x_1, ..., x_n] holds on Φ_D1(v_1), ..., Φ_Dn(v_n) and
        f^1_σ(v_1, ..., v_n) either signals ex(v'_1, ..., v'_m) or returns v; otherwise,
        Φ(f^1_σ(v_1, ..., v_n)) = f^2_σ(Φ_D1(v_1), ..., Φ_Dn(v_n)). ■

For example, let us modify the model A_stk given in Subsection 2.3.2 so that
the function corresponding to Push signals an exception if sequence size is 128, instead of 100,
and call the modified model A'_stk. It can be shown that A_stk is isomorphically
embeddable in A'_stk. A'_stk is 'bigger' than A_stk because the value set corresponding to
Stk has more elements in A'_stk than in A_stk. When optional exception conditions for
constructors are specified to state a least upper bound on the size of the values of the data
type, as in case of the specification of Stk-Int in Figure 3.2, different algebras in F(S) may
have different upper bounds on the size of the values in their value sets.
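
The bounded stack example can be sketched on a small scale. In the fragment below, bounds of 2 and 3 stand in for 100 and 128, integers are restricted to a two-element alphabet so that the value sets are finite, and the identity map plays the role of the one to one partial function; `Overflow` is a hypothetical name for the optional exception. The check covers only Push and only the cases exercised by the loop, so it illustrates the definition rather than implementing it in full.

```python
from itertools import product


class Overflow(Exception):
    """Hypothetical name for the optional exception signalled by Push."""


def make_push(bound):
    """Return a Push that signals once the stack holds `bound` elements."""
    def push(s, i):
        if len(s) >= bound:
            raise Overflow
        return s + (i,)
    return push


def values(bound, alphabet=(0, 1)):
    """All stacks (as tuples) of size at most `bound` over the alphabet."""
    return {t for n in range(bound + 1) for t in product(alphabet, repeat=n)}


def embeddable(small, big, alphabet=(0, 1)):
    """Check, for Push only, that the model with bound `small` embeds in `big`."""
    def phi(s):
        return s  # assumed embedding: the identity on stacks
    push1, push2 = make_push(small), make_push(big)
    v1, v2 = values(small, alphabet), values(big, alphabet)
    if not all(phi(s) in v2 for s in v1):
        return False
    for s, i in product(v1, alphabet):
        try:
            r1 = push1(s, i)
        except Overflow:
            continue  # optional signal: the other model may signal or return
        try:
            r2 = push2(phi(s), i)
        except Overflow:
            return False
        if phi(r1) != r2:
            return False
    return True
```

The model with the smaller bound embeds in the one with the larger bound, but not the other way around, because the larger value set contains stacks with no counterpart in the smaller one.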


3.5.2.3 Partial Isomorphic Embeddability


We combine the notions of partial isomorphic equivalence and isomorphic
embeddability to define another relation. The new relation captures both kinds of
intentional incompleteness, due to preconditions as well as due to optional exception
conditions.


Def. 3.14 A_1 is partially isomorphically embeddable w.r.t. S in A_2 if and only if there exists
a model A' in F(S) such that A' is partially isomorphically equivalent to A_1 and A' is
isomorphically embeddable in A_2. ■


3.5.2.4 Definition of Behavioral Completeness 


We define behavioral completeness of a specification by relating the reduced
models of the data types having maximal amount of nondeterminism allowed by S in D(S)
using the partial isomorphic embeddability relation. The definition of behavioral
completeness is a single level definition in the sense that a specification S can be
behaviorally complete irrespective of whether a specification of a defining type in S is
behaviorally complete. If the specification of a defining type is behaviorally incomplete,
the incompleteness will be reflected in the semantics of a behaviorally complete S. So, in
the definition, we consider only reduced models in F(S) that have the domains
corresponding to each D' ∈ Δ defined by the isomorphically equivalent models in F(S'),
where S' is a specification of D'.




Def. 3.15 A specification S is behaviorally complete iff (i) S is inconsistent, or (ii) for any
two reduced models A_1 and A_2 in F(S) having maximum amount of nondeterminism
allowed by S and whose domains corresponding to each D' ∈ Δ are defined by the
isomorphically equivalent models in F(S'), where S' is a specification of D', A_1 is partially
isomorphically embeddable in A_2 or vice versa. ■


The reasons for having the first case this way in the above definition are that for
an inconsistent S, F(S) = ∅, so any relation among algebras in F(S) holds, and that we
want our definitions to be compatible with the definitions of consistency and completeness
in logic, in which an inconsistent theory is complete.

For example, the specifications of Set-Int, Stk-Int, and Bool in Figures 3.1, 3.2,
and 3.4 respectively can be shown to be behaviorally complete. Note that any specification
not specifying any observers is trivially behaviorally complete. We can show the following:


Thm. 3.3 For a specification S specifying only deterministic operations and not specifying
any precondition or an optional exception condition for an operation, a consistent S is
behaviorally complete iff S precisely specifies a data type D, assuming that the specification
S' of every D' ∈ Δ precisely specifies D'.

Proof The above definition of behavioral completeness reduces under the stated
conditions to requiring that the reduced models in F(S) are isomorphically equivalent.¹⁰
This means that D(S) = {D}. Hence the theorem. ■


The behavioral completeness property guarantees that the behavior of the
operations has not been left unintentionally unspecified. However, there are situations
when the behavioral completeness requirement on specifications is restrictive [31, 51]. For
example, consider a modified version of the specification of Set-Int in Figure 3.1 in which
Choose is not specified to be nondeterministic. In such a specification also, we do not wish to


10. If a specification does not specify a nontrivial precondition for an operation and also does not specify any
optional exception condition, the partial isomorphic embeddability relation reduces to isomorphic
equivalence.




commit to the value Choose may return on a nonempty set, so the axiom specifying
Choose is still

Choose(s) ∈ s = T.

This specification is not behaviorally complete. We would want such a specification to be
behaviorally incomplete, as otherwise Choose must be completely specified. The
behavioral completeness requirement is restrictive in such a case because the reduced
algebras in the semantics of the modified specification are not isomorphically equivalent.
For example, in one reduced algebra, the function corresponding to Choose when applied
on {1, 3} may return 1, while in another reduced algebra, the corresponding function may
return 3. For most specifications specifying nondeterministic operations, if we modify such
a specification so that an operation specified originally to be nondeterministic is instead
specified to be deterministic, then we would often want the modified specification to be
behaviorally incomplete.
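
The situation can be sketched with two hypothetical deterministic interpretations of Choose, both of which satisfy the axiom Choose(s) ∈ s = T on nonempty sets. The names `choose_min`, `choose_max` and `satisfies_axiom` are illustrative assumptions; sets are modeled with `frozenset`.

```python
def choose_min(s):
    """One deterministic Choose: return the smallest element."""
    return min(s)


def choose_max(s):
    """Another deterministic Choose: return the largest element."""
    return max(s)


def satisfies_axiom(choose, sets):
    """Check Choose(s) in s for every nonempty set in `sets`."""
    return all(choose(s) in s for s in sets if s)


SAMPLE = [frozenset({1, 3}), frozenset({2}), frozenset({0, 5, 7})]
```

Both functions are models of the axiom, yet they disagree on {1, 3}, so the corresponding reduced algebras are not isomorphically equivalent and the modified specification is behaviorally incomplete.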




3.6 Comparison With Related Works 


We compare our specification language with those of Guttag et al. [29] with 
extensions proposed in [31], Zilles [77], the ADJ group [22, 23], Burstall and Goguen [7],
Goguen and Tardo [21], and Nakajima et al. [62]. We first discuss the capabilities of these 
specification languages and the approach used to give their semantics. Later, we compare 
the semantics of a specification in these languages. 

Zilles [77] and ADJ [23] do not allow auxiliary functions in a specification, so their
languages have a limited expressive power. Zilles [77] assumes that the operations of a data
type are deterministic and that they do not signal exceptions. The ADJ group [23] does not allow
nondeterministic operations either; they adopt the simpler approach discussed in
Subsection 2.3.3 for modeling exceptions, and discuss a specification language embodying
this approach. Goguen [20] extended the ADJ method of modeling exceptions, which we
compared with our approach in Subsection 2.3.2. His approach for specifying exceptional
behavior of the operations is different from our approach; it is motivated by the view that
exception values are like normal values (and so they are typed). The exceptional behavior
of the operations is specified using equations. Our language is richer than his language
because of the preconditions and the distinction made between optional exception
conditions and required exception conditions. His semantics of the specification method is
complex.

Burstall and Goguen's [7] CLEAR language and its extension, the OBJ language,
support hierarchical structure and modularity like our language. However, Burstall and
Goguen have ambitious goals; they are attempting to develop a general purpose
specification language based on algebraic semantics in which the semantics of a
programming language can be specified. So they are forced to introduce complex
mechanisms, for instance, procedures operating on theories, which make the specification
language hard to understand. The category-theoretic semantics of their language is also
complex [30]. Our approach instead has been to concentrate on the data component of
programs, and develop a specification language and a formalism for data types. Our
semantic method is simpler.

Guttag et al.'s work [29] is the closest to our work. Their language is limited as it
cannot specify data types with nondeterministic operations. As was said in Section 3.1, our
specification language is an enrichment of the specification language in [31]. Our
formalism can provide a semantics for their specification language. Our formalism can also
be used to provide a mathematical basis of the AFFIRM system [60, 61]. In this sense, our
formalism places their work on a firm basis.

Nakajima et al. [62] specify a data type, as discussed in Chapter 1, as a first order 
theory. Their method differs from other methods including our method because they allow 
any first order formula to be an axiom in a specification. Auxiliary functions are not 
allowed in a specification. Operations are assumed to be deterministic: they do not signal 
exceptions. We have not yet seen the semantics of their specification language. If we
assume that a first order theory is interpreted in a standard way as in Logic [16], the 
problems with this approach are discussed in the related work section of the first chapter. 
We further comment on their specification method in the next chapter from the point of 
view of deducing properties from a specification. 

Burstall and Goguen, Nakajima et al., and Guttag [31] can specify a type scheme 
(also called a parameterized type) in their languages. Recently, the ADJ group [71] has
given a category theoretic semantics of a parameterized type. Our specification language, 
as it is, cannot express a parameterized type. However it should be evident from the 
discussion that our formalism as well as specification language can be easily extended to 
parameterized types. We discuss these extensions in the last chapter of the thesis. 

There are differences between our semantics of a specification, and those of
Zilles, the ADJ group, and Guttag et al. [28], which are motivated by different definitions
of a data type used in various formalisms. Zilles and the ADJ group assume that values not
specified to be related by the axioms are different, even if they are observably equivalent.
Guttag et al. on the contrary assume that the values are equivalent unless specified to be
different. We have taken a different approach; we consider the axioms as specifying the
observable equivalence relation. Our approach towards the semantics of a specification is
similar to the one adopted in logic; we consider all models of the axioms to be the
semantics of the specification. (Of course, we consider only the algebras satisfying the
minimality property for modeling data types, and rule out nonstandard models.) Our
semantics thus subsumes Zilles's and the ADJ group's definitions, as well as Guttag et al.'s
definition in the following way.

To understand the semantics of a specification in the ADJ group formalism as
well as in Zilles's formalism, we introduce the following definition. As is stated in
Subsection 2.2.6, the models in F(S) can be partially ordered using the onto
homomorphism relation, i.e., A_1 ≤ A_2 if and only if A_1 is a homomorphic image of A_2.


Def. 3.16 A model A in F(S) is called initial if A is a maximal model with respect to the
homomorphism relation, and A identically satisfies S. ■


In an initial model A, V_D' for each D' ∈ Δ is a value set defined by an initial model in F(S'),
where S' is a specification of D'. Two members in V_D are not the same unless they are
related by the axioms and restrictions. The ADJ group and Zilles define the semantics of a
specification S to be the set of initial models in F(S). Guttag et al., on the other hand,
define the semantics of a specification S to be the set of reduced models in F(S).
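
The contrast between the two kinds of models can be sketched for Set-Int, under the assumption that the axioms of the specification do not relate distinct constructor terms. In that case a model that keeps the whole insertion history (here a tuple) behaves like an initial model, while a model that identifies observably equivalent values (here a `frozenset`) behaves like a reduced model. The observer `has` and all other names are hypothetical.

```python
def insert_initial(s, i):
    """Insert in a history-keeping model: values are insertion sequences."""
    return s + (i,)


def insert_reduced(s, i):
    """Insert in a reduced model: observably equivalent values coincide."""
    return s | {i}


def has_initial(s, i):
    return i in s


def has_reduced(s, i):
    return i in s


def h(s):
    """Onto homomorphism from the history-keeping model to the reduced one."""
    return frozenset(s)


a = insert_initial(insert_initial((), 0), 0)  # Insert(Insert(empty, 0), 0)
b = insert_initial((), 0)                     # Insert(empty, 0)
```

The values `a` and `b` are different in the history-keeping model but are mapped by `h` to the same value of the reduced model, and `h` commutes with Insert and preserves the observer, which is the sense in which the reduced model is a homomorphic image of the other.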




4. Deductive System 


In this chapter, we develop a deductive system for abstract data types. The
deductive system embodies general properties of data types which are not explicitly stated
in a specification but assumed in the semantics of the specification language. We construct
a theory of a data type, which is a collection of properties of the data type, from its
specification. The theory of a data type can be used in reasoning about programs and
designs that use the data type in the same way as the properties of natural numbers are used
in reasoning about programs operating on natural numbers. In particular, the correctness
proof of an implementation of a data type D with respect to its specification, as discussed in
the next chapter, involves the use of the theories of its defining types and the theory of its
rep, the data type whose values are used to represent the values of D in the
implementation. We can pose questions about the behavior of a data type and check
whether they can be answered from its specification according to our intentions using the
deductive system. In this sense, constructing the theory of a data type can enhance our
confidence in its specification.

The construction of the theory of a data type from its specification has an
important advantage that the theory does not depend on any particular implementation of
the data type. The correctness criterion used for implementations in Chapter 5 guarantees
that every property in the theory is satisfied by every correct implementation. We can thus
reason about programs using a data type abstractly without referring to any particular
implementation of the data type. This separation between the theory of a data type and its
implementations via the specification factors the proof process into two independent parts:
(i) proof of use of a data type, and (ii) proof of correctness of implementation of a data type
[37]. In this chapter, we discuss the first part; we discuss the second part in the next
chapter.

The theory of a data type is constructed hierarchically from its specification, using
the theories of the types used in the specification, just like the specification of a data type is
designed. The design of our specification language has been influenced by the goal that a
specification should not have to state more than what is required and that it be structured
in the sense that different components of the data type behavior are separately specified.
To construct the theory of a data type from its specification, we combine these components.
For instance, as is discussed in the previous chapter, an axiom in the axioms component has
a restricted interpretation: a variable of type D' in the axiom cannot be freely substituted;
instead, the substitution should be such that the input to every operation symbol satisfies its
precondition as specified by the restrictions component, and no operation invocation
should signal. We first construct the unrestricted axioms from the restricted axioms in the
axioms component of a specification using the restrictions; these unrestricted axioms are
used to construct the theory. Henceforth, we refer to a (restricted) axiom in the axioms
component of a specification as a formula and to an unrestricted axiom as an axiom to
avoid confusion.

The proposed deductive system is used to prove properties. We have
not investigated the possibilities of automating the deductive system, but we relate our
work to Musser's work [60, 61] on automating the proof theory of data types from their
algebraic specifications.

Instead of discussing the complete deductive system and the construction of a
theory from a specification specifying nondeterministic operations and operations
exhibiting exceptional behavior in a single shot, we do so step by step. We first discuss the
theory of a data type with deterministic operations and without considering their
exceptional behavior. We then incorporate the exceptional behavior of data types into
their theory. Finally, we discuss data types with nondeterministic operations to exhibit the
extra machinery needed for introducing nondeterminism.

For specifications specifying only deterministic operations, we discuss various
subtheories, namely, the equational subtheory, distinguishability subtheory, and inductive
subtheory, constructed using different fragments of the deductive system. We define three
structural properties of a specification, namely, sufficient completeness, well definedness,
and completeness. Checking for these properties for a specification is a step towards
ensuring the correctness of the specification. We precisely state the sufficient completeness
property defined by Guttag and Horning [28] for a restricted set of specifications and
extend it to specifications in our specification language. This property requires that the
behavior of the observers on their intended inputs can be completely determined from the
specification by purely equational reasoning. We relate this property to the behavioral
completeness property discussed in the previous chapter, which is model theoretic and
which requires that the specification completely specify the behavior of the observers on
intended inputs. Recall that the behavioral completeness property does not say anything
about what can be deduced from the specification. We show that sufficient completeness is
stronger than behavioral completeness.

The completeness property is even stronger than the sufficient completeness
property, since in addition to the requirement that the behavior of the observers can be
deduced on any intended input by equational reasoning, it also requires that the
equivalence of the observable effect of the constructors on intended inputs can be deduced
from the specification by equational reasoning.

The well definedness property requires that a specification be modular in the
sense that it preserves the specifications of the defining types and auxiliary types used in it. This
property is stronger than the consistency property.

In the last section, we define a stronger equivalence on specifications than the
equivalence defined in Section 3.3. The stronger equivalence of specifications requires
not only that the two specifications have the same semantics, but also that their theories be the
same.




4.1 Preliminaries 


A data type can have many different but equivalent specifications (see Section 3.3
and Section 4.5). These specifications may differ because

(i) they may specify the properties of constructors to different extents,

(ii) they may specify the properties of the operations in different ways, and

(iii) they may use different sets of auxiliary functions.

Theories constructed from different equivalent specifications can be different, as will be
clear from the following discussion. Unless stated otherwise, we assume that a data type
has a single fixed specification; in the last section of the chapter, we discuss theories
constructed from different but equivalent specifications of a data type.

If a specification S specifies only a single data type D, then the theory constructed 
from S is the theory of D. If S specifies a set of related data types, then the theory 
constructed from S is the theory of the set of related data types. The theory constructed 
from S consists of properties characterizing the behavior of the algebras in F(S), the
semantics of S. Let Th(S) stand for the theory constructed from S. 

The deductive system uses multi-sorted (or many sorted) first order predicate 
calculus with identity [16] as the underlying logic. Though a first order theory cannot 
completely characterize the ‘infinite’ models in F(S), we prefer first order logic over second 
order logic because of the following reasons: 

(i) First order logic is well studied, and is better understood than second order logic, 

(ii) most of the programming logics developed for reasoning about the control structures 
of programming languages are first order, 

(iii) the recent work of Cartwright and McCarthy [8] has established that even the
termination proof, which was believed to employ second order reasoning, can be
adequately done in first order logic,

(iv) most of the work in automatic verification uses first order logic as the underlying 
basis, and 

(v) we believe that most of the interesting properties of programs can be expressed in
first order logic.


Multi-sorted logic is more convenient than single-sorted logic as it avoids the use of type
predicates, which must be introduced in a single-sorted logic to differentiate among terms
of different types. We use an induction rule having infinitely many premises, which is somewhat
unusual; the proofs using this rule are infinitary. We interpret the formulas in Th(S)
in the algebras in F(S); we do not consider uncountable structures because they are not
type algebras and so they are of no interest.

As was discussed in the previous chapter, a formula is interpreted in a type
algebra in the same way as a formula in a structure in logic [16], except that the symbol =
is interpreted as the observable equivalence relation (see the definition in Sections 2.2 and
2.3) on a domain instead of the identity relation. Because the observable equivalence
relation is an equivalence relation and is preserved by every function in a type algebra, the
standard rules for identity hold (i.e., the rules for identity are sound under this
interpretation).

We now discuss the structure of formulas expressing properties of the models in
F(S). Following Enderton [16], we define the language of Th(S) as the set of nonlogical
symbols; the nonlogical symbols are used with the logical symbols to construct formulas.¹
Let L(S) stand for the language of Th(S). Instead of defining the complete language of
Th(S) here, we introduce it incrementally. We discuss here L(S) for a specification specifying
neither nondeterministic operations nor the exceptional behavior of the operations.
L(S) includes the operation symbols of D specified by S as well as the auxiliary function
symbols used in S. Since Th(S) is constructed using the theories of the defining types and
the theories of the auxiliary types used in S, L(S) includes L(S'), where S' is a specification
of a data type D', for each D' ∈ Δ ∪ Δ_a.

In Section 4.3 on specifications specifying exceptional behavior of the operations,
we include the exception names in L(S). In Section 4.4 on specifications specifying
nondeterministic operations, L(S) includes additional symbols needed for expressing
properties about nondeterministic operations.


1. A symbol (or an axiom or a rule of inference) is called nonlogical if it is specific to a particular domain of
discourse whose theory is being constructed. This is in contrast to logical symbols, which are determined by
the underlying logic used to develop the theory. For instance, a logical axiom characterizes the logical
reasoning available in the underlying logic, whereas a nonlogical axiom characterizes a property about the
domain of discourse.


Terms of various types can be constructed using the symbols in L(S) and variables
of various types as discussed in the previous chapter. An atomic formula is an equation of
the form ‘e_1 = e_2’, where e_1 and e_2 are terms of the same type. Compound formulas are
constructed from atomic formulas using the standard rules of construction for first order
predicate calculus with the help of logical symbols.

We consider a boolean term as a term, rather than an atomic formula; in this
sense, we adopt a uniform view about the symbols in L(S), considering each as a function
symbol. This view is especially convenient when we incorporate the exceptional behavior
of the operations. In case we use a boolean term b as a formula, b is considered as the
abbreviation for the equation ‘b = T.’

Recall that ‘e_1 = if b then e_2’ is an abbreviation for ‘e_1 = if-then-else(b, e_2, e_1)’ and
‘e_1 = if b then e_2 else e_3’ stands for the following two conditional equations:
‘e_1 = if b then e_2,’

‘e_1 = if ~ b then e_3.’


In the simple case when exceptional behavior is not considered, ‘e_1 = if b then e_2’ is
equivalent to ‘(b = T) ⇒ (e_1 = e_2).’ When we incorporate exceptional behavior, the above
equivalence does not always hold, because b could possibly signal an exception. However,
if b is guaranteed not to signal, then the above equivalence holds in that case also.

We use the abbreviation ‘e_1 ≠ e_2’ for the formula ‘~ (∀ x_1, ..., x_n) [e_1 = e_2]’,
where x_1, ..., x_n are the only variables in e_1 and e_2. Note that if e_1 and e_2 are ground terms,
then ‘e_1 ≠ e_2’ is equivalent to ‘~ (e_1 = e_2).’ In fact, it is easy to see that

(∀ x_1, ..., x_n) [~ (e_1 = e_2)] ⇒ (e_1 ≠ e_2).

Only a subset of Th(S) is useful in reasoning about programs and designs using D. 
This subset consists of formulas in Th(S) expressed using only the operation symbols. 
Formulas expressed using auxiliary functions are not directly useful because the auxiliary 
functions are not available to the users of the data type(s) being specified, but these 
formulas help in proving formulas without auxiliary functions. The correctness criterion 
for implementations with respect to a specification S discussed in the next chapter does not
require a correct implementation to include implementations of auxiliary functions used in
S. Even if an auxiliary function is implemented, it is not available to the users of a data
type.

Let L(D) stand for the language of a data type D, which is a subset of L(S)
consisting only of the operation symbols. L(S) - L(D) is then the set of auxiliary functions
used in specifications of various data types. Let Th(D) stand for the subset of Th(S)
consisting of formulas in Th(S) expressed using the nonlogical symbols in L(D). We are
primarily interested in formulas in Th(D). The correctness criterion used in the next
chapter ensures that Th(D) holds for all correct implementations with respect to S. Th(D)
serves as the interface between programs using D and the correct implementations of D.
Note that Th(D) does not include those nonlogical axioms of Th(S) which are expressed
using auxiliary functions.




4.2 Theory of Data Types without Nondeterminism and without
Exceptional Behavior


We start with the simple case of specifications that specify neither
nondeterministic operations nor the exceptional behavior of the operations. The
restrictions component of such a specification may specify nontrivial preconditions for
the operations. For illustration, we modify the data type Set-Int so that Choose is
deterministic; let Set-Int' stand for the modified Set-Int. The specification of Set-Int' is
given in Figure 4.1, which is obtained by modifying the specification of Set-Int given in
Figure 3.1. The syntactic specification of the operation Choose does not have the identifier
nondeterministic. Instead of the required exception condition for Choose on the empty set,
we specify ‘~ (#(s) = 0)’ as its precondition in the restrictions component of the specification
of Set-Int'.

We first discuss how to construct unrestricted nonlogical axioms of Th(S) from


Figure 4.1. Specification of Set-Int'

Operations

Null   : → Set-Int'                  as ∅
Insert : Set-Int' × Int → Set-Int'
Remove : Set-Int' × Int → Set-Int'
Has    : Set-Int' × Int → Bool       as x_2 ∈ x_1
Size   : Set-Int' → Int              as #(x_1)
Choose : Set-Int' → Int

Restrictions

Pre(Choose(s)) :: ~ (#(s) = 0)

Axioms

1. Remove(∅, i) = ∅
2. Remove(Insert(s, i1), i2) = if i1 = i2 then Remove(s, i1) else Insert(Remove(s, i2), i1)
3. i ∈ ∅ = F
4. i1 ∈ Insert(s, i2) = if i1 = i2 then T else i1 ∈ s
5. #(∅) = 0
6. #(Insert(s, i)) = if i ∈ s then #(s) else #(s) + 1
7. Choose(s) ∈ s = T




the formulas in the axioms component and the preconditions specified in S. We then
discuss how to construct Th(S) from the nonlogical axioms thus obtained. We do so step
by step, exhibiting the power of various fragments of the deductive system. This will also
help in investigating how easily these fragments can be automated. We first discuss a
simple but useful subset of Th(S), called the equational subtheory and written as EQ(S).
Formulas in EQ(S) are proved using the rules of = and the substitution rule of ∀. Most of
the work on developing the proof theory of data types from their algebraic specifications
has focused on this subtheory [23, 71, 7, 21, P].

We discuss later a richer subtheory, called the distinguishability subtheory and
written as DS(S), having inequalities ‘e_1 ≠ e_2’ in addition to equations. The inability to
prove an inequality has been a major limitation of the recent works on proof theories based
on algebraic specifications. For instance, both in Zilles's method as well as in ADJ's method,
two terms e_1 and e_2 are unequal, i.e., e_1 ≠ e_2 is provable, if and only if e_1 = e_2 is not in the
equational subtheory, so the proof of inequality becomes meta. Zilles [76] recognizes this
limitation and suggests also using inequalities as axioms. In our deductive system,
inequalities can be proved from equations by the method of proof by contradiction. We
have this advantage because we view two abstract values (i.e., ground terms) of a data type
to be distinguishable (so unequal) if and only if a sequence of operations can distinguish
them. This is in contrast to the view taken by the ADJ group and Zilles that two abstract
values are distinguishable if and only if they are not specified to be equal.

We later include an induction rule which captures the minimality property of a
data type. This rule is ‘infinite’ and is derived from the syntactic specifications of the
operations and the restrictions component of the specification. More properties of a data
type can be proved using the induction rule than without it. We discuss how the rule is
used to prove other rules using the nonlogical axioms derived from the specification, which
simplify the proof of properties of the data type. The subset of equations and inequalities
provable using the induction rule and the rules of the distinguishability subtheory is called
the inductive subtheory and written as IND(S).

We finally construct the full theory Th(S) using the whole machinery of first
order predicate calculus and the ‘infinite’ induction rule. We demonstrate the use of Th(S)
in verifying properties of programs. Every subtheory (as well as the full theory Th(S)) is
constructed hierarchically from the corresponding subtheory (or the full theory)
constructed from the specifications of the defining types and the auxiliary types used in S.
For instance, IND(S) is constructed from IND(S'), where S' is a specification of
D' ∈ Δ ∪ Δ_a.

In the last subsection, we define sufficient completeness, completeness, and well
definedness properties of a specification, and relate them to behavioral completeness and
consistency properties discussed in Section 3.5.

4.2.1 Derivation of Nonlogical Axioms


The unrestricted nonlogical axioms for a specification S can be derived in a
straightforward way. If S specifies a nontrivial precondition for some operations, then the
nonlogical axioms are generally conditional equations. Let PC_e stand for a conjunction of
conditions of the form ‘P_σ(e_1, ..., e_n) = T’, where P_σ is the precondition specified for σ,
for every occurrence in e of an operation σ having the input e_1, ..., e_n. If an equation
‘e_1 = e_2’ is in the axioms component of S, the corresponding
nonlogical axiom of Th(S) is the following:

(PC_{e_1} ∧ PC_{e_2}) ⇒ e_1 = e_2.

For example, the formula

Choose(s) ∈ s = T

has an occurrence of the operation Choose, which is specified to have a nontrivial
precondition, so the corresponding unrestricted nonlogical axiom is

(~ (#(s) = 0) = T) ⇒ Choose(s) ∈ s = T.

If a formula in the axioms component does not have any operation specified to
have a nontrivial precondition, then the formula itself serves as the nonlogical axiom. For
example, the formula

#(Insert(s, i)) = if i ∈ s then #(s) else #(s) + 1

itself serves as a nonlogical axiom.

For any restricted quantifier-free formula f, the corresponding unrestricted
formula is ‘PC ⇒ f’, where PC is a conjunction of the formulas PC_e for every term e in
the formula f.
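The derivation described in this subsection can be sketched in Python. The sketch is illustrative only: the tuple encoding of terms, the PRECONDITIONS table, and the function names show, preconditions, and unrestricted are assumptions made for this example and are not part of the specification language. It collects the precondition of every operation occurrence on both sides of a restricted equation and prefixes their conjunction to the equation.

```python
# Terms are nested tuples such as ("Has", s, ("Choose", s)).
# Variables are encoded as ("var", name). This encoding is hypothetical.

# Operation name -> function rendering its precondition on rendered arguments.
# Only Choose has a nontrivial precondition in Figure 4.1.
PRECONDITIONS = {
    "Choose": lambda args: f"~(#({args[0]}) = 0)",
}

def show(term):
    """Render a term as text, e.g. ("Choose", ("var", "s")) -> 'Choose(s)'."""
    if term[0] == "var":
        return term[1]
    op, *args = term
    if not args:
        return op
    return f"{op}({', '.join(show(a) for a in args)})"

def preconditions(term):
    """Collect one 'precondition = T' condition per restricted operation occurrence."""
    if term[0] == "var":
        return []
    op, *args = term
    conds = [c for a in args for c in preconditions(a)]
    if op in PRECONDITIONS:
        conds.append(PRECONDITIONS[op]([show(a) for a in args]) + " = T")
    return conds

def unrestricted(lhs, rhs):
    """Build '(PC) => lhs = rhs', or the bare equation when PC is empty."""
    conds = []
    for c in preconditions(lhs) + preconditions(rhs):
        if c not in conds:          # drop repeated conditions
            conds.append(c)
    equation = f"{show(lhs)} = {show(rhs)}"
    if not conds:
        return equation
    return f"({' and '.join(conds)}) => {equation}"

s = ("var", "s")
axiom7 = unrestricted(("Has", s, ("Choose", s)), ("T",))   # Choose(s) in s = T
axiom5 = unrestricted(("Size", ("Null",)), ("0",))         # #(Null) = 0
print(axiom7)
print(axiom5)
```

In this encoding the formula for Choose acquires its precondition as a hypothesis, while the formula for Size is returned unchanged because no operation in it has a nontrivial precondition.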




4.2.2 Equational Subtheory 


The equational subtheory EQ(S) consists of equations derived from the
nonlogical axioms of S. An equation ‘e_1 = e_2’ is in EQ(S) if and only if it is provable from
the nonlogical axioms of S and EQ(S'), where S' is a specification of D', for each
D' ∈ Δ ∪ Δ_a, using the four rules of =, namely,

(i) reflexivity,

(ii) symmetry,

(iii) transitivity,

(iv) substitution property of every function symbol,

and

(v) the substitution rule for the universal quantifier ∀ (i.e., substituting an appropriate
term for every occurrence of a free variable in a nonlogical axiom).

Not all five of the above rules are necessary; some of them can be derived from the others
[16]. As an illustration, we give a proof of the equation ‘#(Insert(Insert(Null, i), i)) = 1’ in
Figure 4.2.

EQ(S) defines a relation on ground terms of different types; let EQ_D' stand for
this relation on ground terms of type D'. For any ground terms e_1 and e_2, ⟨e_1, e_2⟩ ∈ EQ_D' if
and only if ‘e_1 = e_2’ ∈ EQ(S).

If the nonlogical axioms are equations (possibly using if-then-else functions), they
can be considered as unidirectional rewrite rules by defining an appropriate ordering on
terms. If a decision procedure for EQ(S) exists (i.e., the relation EQ_D' for each
D' ∈ Δ ∪ Δ_a ∪ {D} is decidable), then it is often possible to generate a convergent set of
rewrite rules from the nonlogical axioms using the Knuth-Bendix algorithm [44], which


Figure 4.2. Proof of ‘#(Insert(Insert(Null, i), i)) = 1’


1. i ∈ Insert(Null, i) = T                                Substitution in Axiom 4 of Set-Int' and the theorem of Int
2. #(Insert(Insert(Null, i), i)) = #(Insert(Null, i))    Step 1, substitution in Axiom 6 of Set-Int'
3.                               = #(Null) + 1           Axiom 3, substitution in Axiom 6 of Set-Int', and transitivity
4.                               = 0 + 1                 Axiom 5 of Set-Int'
5.                               = 1                     Theorem of Int




constitutes the decision procedure for EQ(S). The AFFIRM system [60] is designed in part
around this result. Though nonlogical axioms using if-then-else functions have been
studied [60, 21, 5], there appear to be some difficulties in using the Knuth-Bendix
algorithm on them [61].
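Reading the axioms as left-to-right rewrite rules can be illustrated with a small Python sketch. It is a hand-written evaluator for Axioms 3 to 6 of Figure 4.1 on ground terms built from Null and Insert, not the output of the Knuth-Bendix algorithm and not the AFFIRM system; the tuple encoding and the names NULL, insert, has, and size are assumptions made for this example. It reproduces the result of Figure 4.2 for one concrete integer.

```python
# Ground terms of Set-Int' built from Null and Insert, as nested tuples.
# Python ints stand in for ground terms of type Int (an assumption).
NULL = ("Null",)

def insert(s, i):
    """Build the term Insert(s, i) without simplifying it."""
    return ("Insert", s, i)

def has(s, i):
    """Has(s, i), i.e. 'i in s', rewritten with Axioms 3 and 4."""
    if s == NULL:
        return False                           # Axiom 3
    _, rest, j = s
    return True if i == j else has(rest, i)    # Axiom 4

def size(s):
    """Size(s), i.e. '#(s)', rewritten with Axioms 5 and 6."""
    if s == NULL:
        return 0                               # Axiom 5
    _, rest, j = s
    return size(rest) if has(rest, j) else size(rest) + 1   # Axiom 6

i = 7                                  # Figure 4.2 holds for an arbitrary i
term = insert(insert(NULL, i), i)
print(size(term))
```

Each branch applies exactly one axiom, so the recursion mirrors the chain of equalities in Figure 4.2.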

For automating the process of proving properties from the nonlogical axioms of S
using the above five rules, it may be helpful to view a formula of the form

PC ⇒ (e_1 = e_2),

where PC is a conjunction ‘b_1 = T ∧ ... ∧ b_k = T’, as the formula

e_1 = if b_1 ∧ ... ∧ b_k then e_2,

as the two formulas are equivalent and the second formula can be considered as a rewrite
rule. For example,

(~ (#(s) = 0) = T) ⇒ Choose(s) ∈ s = T

can be viewed as

Choose(s) ∈ s = if ~ (#(s) = 0) then T.


4.2.3 Distinguishability Subtheory 


The distinguishability subtheory DS(S) is richer than EQ(S); it has two kinds of
formulas: (i) ‘e_1 = e_2’ and (ii) ‘e_1 ≠ e_2’. Our approach for proving inequalities is simple; it is
based on the definition of distinguishability discussed in Sections 2.2 and 2.3. The
distinguishability theory of Bool serves as the basis; since ‘T ≠ F’ is a formula in the
specification of Bool, ‘T ≠ F’ ∈ DS(Bool). (Recall that only the specification of Bool
includes an inequality as an axiom.) ‘T ≠ F’ obviously holds in every model of Bool. This
inequality is used to prove inequalities of terms of type D by reductio ad absurdum (proof
by contradiction); this is the sixth logical rule, besides the five rules discussed in the
previous subsection, which is used to construct the subtheory DS(S). We of course use
inequalities in DS(S'), where S' is a specification of D' ∈ Δ ∪ Δ_a.

Given two terms e_1 and e_2, we prove ‘e_1 ≠ e_2’ as follows:

We assume on the contrary that e_1 = e_2;

we then derive ‘e_3 = e_4’, where ‘e_3 ≠ e_4’ is already provable, i.e., either
‘e_3 ≠ e_4’ ∈ DS(S'), or ‘e_3 ≠ e_4’ ∈ DS(S).




We illustrate the above rule to prove the inequality ‘Null ≠ Insert(s, i)’ in Figure 4.3. For
any ground terms e_1 and e_2, the formula ‘e_1 ≠ e_2’ is interpreted in a model in F(S) [...]

The method of proof by contradiction can be incorporated into a rewrite rules
system like AFFIRM. If an inequality ‘e_1 ≠ e_2’ is to be proved, we assume ‘e_1 = e_2’ as an
axiom and add it to the set of nonlogical axioms. We get the rewrite rules corresponding
to the new set of axioms and run them to check whether a contradiction, i.e., one of the
rules ‘T → F’ and ‘F → T’ or ‘e_3 → e_4’, is generated, where the inequality ‘e_3 ≠ e_4’ is already
proved.

4.2.4 Inductive Subtheory


The subtheory DS(S) is still not rich enough because there are many useful
equational formulas which hold for every data type in F(S), but cannot be proved using the
logical rules of DS(S). For example, the equation

Has(Remove(s, i), i) = F

cannot be proved because

(i) there is no nonlogical axiom directly expressing a property of Has on a set argument
having the structure Remove(s, i), and

(ii) Remove(s, i) is not equivalent to Null or an expression of the form Insert(s', i') unless
some conditions are placed on s.

But ‘Has(Remove(s, i), i) = F’ holds in every model in F(Set-Int'). Even if we use the
whole deductive system of first order predicate calculus, this formula cannot be proved
from the nonlogical axioms of Set-Int'.


Figure 4.3. Proof of Null ≠ Insert(s, i)


To prove Null ≠ Insert(s, i),

assume Null = Insert(s, i);
Has(Null, i) = Has(Insert(s, i), i),      substitution property of Has
F = T,                                    the axioms 3 and 4 of Set-Int',
which is a contradiction.

So ‘Null ≠ Insert(s, i)’ ∈ DS(Set-Int').
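The argument of Figure 4.3 applies one observer to both sides and obtains different boolean results. A minimal Python sketch of that idea on ground terms follows; the encoding and the helper name distinguishing_observer are assumptions made for this example, and only the observer Has is tried, over a finite range of integers. A result of None therefore means only that no tried observer separated the two terms, not that they are provably equal.

```python
# Ground terms of Set-Int' built from Null and Insert, as nested tuples.
NULL = ("Null",)

def insert(s, i):
    """Build the term Insert(s, i)."""
    return ("Insert", s, i)

def has(s, i):
    """Has(s, i) evaluated with Axioms 3 and 4 of Figure 4.1."""
    if s == NULL:
        return False
    _, rest, j = s
    return True if i == j else has(rest, i)

def distinguishing_observer(e1, e2, ints):
    """Return a description of an observer Has(_, k) whose results differ
    on e1 and e2, or None if no k in ints tells them apart."""
    for k in ints:
        if has(e1, k) != has(e2, k):
            return f"Has(_, {k})"
    return None

# Null versus Insert(Null, 5): Has(_, 5) yields F on one side and T on the other.
witness = distinguishing_observer(NULL, insert(NULL, 5), range(10))
# Two insertion orders of the same elements are not separated by Has.
same = distinguishing_observer(insert(insert(NULL, 1), 2),
                               insert(insert(NULL, 2), 1), range(10))
print(witness, same)
```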




The above limitation is due to the fact that the minimality property of data types,
which is captured in the definition of a type algebra, is neither captured in the underlying
logic nor expressed as a nonlogical axiom (see the discussion of the minimality property in
Section 2.1). We discuss below an induction rule which captures this property. The rule
can be constructed from the syntactic specifications of the operations in S. We compare
our rule with other similar rules proposed in the literature, and demonstrate the inadequacy
of some of these rules. We discuss how the ‘infinite’ rule can be used in proofs. For better
exposition, we first assume that no constructor of D is specified to have a nontrivial
precondition by S; we later relax this restriction.

4.2.4.1 Infinite Induction Rule


Def. 4.1 A ground term e is called a constructor ground term if e is expressed only using
constructor symbols. □


(+) Induction Rule
Given a formula Φ(x) with a free variable x of type D.
For every constructor ground term e of type D, Φ[x/e] ⊢ (∀ x) Φ(x).


The above inference rule is infinitary, as there are usually infinitely many constructor
ground terms of type D and so the rule requires infinitely many premises. The notion of a
proof is infinitary whenever the induction rule is used. Intuitively, the above rule states
that if a formula Φ(x) holds in every case when a value of type D is substituted for x, then
we can deduce the formula ‘(∀ x) Φ(x).’ It is easy to see that the above rule is sound
because every type algebra by definition has the minimality property, which states that
every value of D is represented by some constructor ground term of type D. It is sufficient
to consider only constructor ground terms because these represent every value in a type
algebra.
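To make the set of premises concrete, the following Python sketch enumerates the constructor ground terms of Set-Int' up to a given nesting depth. The CONSTRUCTORS table, the sort names, and the function ground_terms are assumptions made for this example, and a small finite pool of integers stands in for the ground terms of Int. The enumeration is necessarily bounded, so it only exhibits an initial fragment of the infinitely many premises.

```python
from itertools import product

# Constructor signature of Set-Int' from Figure 4.1: name -> argument sorts.
CONSTRUCTORS = {
    "Null": (),
    "Insert": ("Set", "Int"),
    "Remove": ("Set", "Int"),
}

def ground_terms(depth, ints):
    """All constructor ground terms of type Set-Int' with nesting depth <= depth."""
    if depth == 0:
        return [("Null",)]
    smaller = ground_terms(depth - 1, ints)
    terms = [("Null",)]
    for name, sorts in CONSTRUCTORS.items():
        if not sorts:
            continue                      # Null was added above
        # One pool of candidate arguments per argument position.
        pools = [smaller if sort == "Set" else list(ints) for sort in sorts]
        terms.extend((name, *args) for args in product(*pools))
    return terms

# Number of premises of the rule restricted to depth d, with two integers.
counts = [len(ground_terms(d, [0, 1])) for d in range(4)]
print(counts)
```

Even with only two integers the number of terms grows quickly with the depth, which is one way to see why the premises cannot be established one at a time.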

Burstall and Goguen [7] also realized the limitation of the proof theory based on
the rules of =.² They introduced the induce operator on theories; the induced theory is
equivalent to the original theory with the above induction rule. The above induction rule is
a generalization of the structural induction rule of Burstall [6]. The structural induction
rule is based on identifying a minimal set of constructors (instead of all constructors) which
generates the values of D and has the property that every finite sequence of constructors in
the subset generates a distinguishable value. To our knowledge, Wegbreit and Spitzen [72]
were the first to generalize the structural induction rule, but they presented it informally.
The data induction rule of Guttag et al. [29] is the same as the induction rule of Wegbreit
and Spitzen. Recently, Musser [61] has suggested a formalization similar to our
formulation of the rule.


4.2.4.2 Rationale for an Infinite Induction Rule


Below, we discuss the rationale for using an infinite rule to capture the
minimality property of a data type. We demonstrate the inadequacy of an induction
scheme seemingly suggested by Wegbreit and Spitzen [72], Guttag et al. [29], and Nakajima
et al. [62]. For illustration, we use a simple version of the data type natural number,
denoted by N_0. N_0 has four operations: 0, the constant zero; S, the successor operation; P,
the predecessor operation; and =, the equality operation. Its specification is given in
Figure 4.4. The constructor P is derived in the sense that the values returned by P can be
constructed using 0 and S. We would like to prove, from the nonlogical axioms of N_0 and
the induction rule, the following normal form lemma in the full theory:

(1) (∀ x) [x = 0 ∨ (∃ y) [x = S(y)]].

In general, we would like to have in Th(N_0) the scheme

(2) (Φ(0) ∧ (∀ x) [Φ(x) ⇒ Φ(S(x))]) ⇒ (∀ x) Φ(x),

where Φ is a first order formula with at least one free variable.
If we express the minimality property of N_0 with the following scheme:

(3) (Φ(0) ∧ (∀ x) [Φ(x) ⇒ (Φ(P(x)) ∧ Φ(S(x)))]) ⇒ (∀ x) Φ(x),


2. However, ADJ [71] do not seem to agree that properties provable using the induction rule are relevant.




Figure 4.4. Specification of Data Type N_0


Operations

0 : → N_0
S : N_0 → N_0
P : N_0 → N_0
= : N_0 × N_0 → Bool

Axioms

P(0) = 0
P(S(x)) = x
(x = x) = T
(x = y) = (y = x)
(S(x) = 0) = F
(S(x) = S(y)) = (x = y)


where Φ is a first order formula, we can prove neither (1) nor (2). This is because there are
nonstandard models of the nonlogical axioms given in Figure 4.4 and the scheme (3), in
which the scheme of formulas (2) and/or the formula (1) do not hold. Figure 4.5 is one
such model in which the nonlogical axioms as well as the scheme (3) hold but the formula
scheme (2) does not hold. The model has an infinite chain going from a constant symbol c
in both directions in addition to the chain of natural numbers, and there is a unary
predicate symbol M whose interpretation in the model is the predicate which is false on all
constants on the negative side of c, and true otherwise. The figure shows the values in the
model on which the interpretation of M is false.

The scheme (3) does not capture the property that the operation P when applied
on any natural number ‘will hit in finitely many steps’ either 0 or a number that behaves as
0 (in nonstandard models). This property is needed to derive (1) or (2).

It should be obvious that the scheme (2) as well as the formula (1) hold in every
model in F(N_0). Formulas of the kind (2) and the formula (1) are very useful in proving
properties of programs using N_0. For example, using the formula scheme (2), the proof by
induction amounts to checking for the basis condition and a single case in the inductive
step, whereas (3) requires two cases in the inductive step.

We would like the induction rule to be constructible from the syntactic




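Figure 4.5. A Nonstandard Model of the Axioms in N_0 with the Scheme (3)

[Diagram: the standard chain 0, 1, 2, 3, 4, ... linked by S (forward) and P (backward), together with a second chain ..., c-2, c-1, c, c+1, c+2, ... that extends infinitely in both directions and is linked by S and P in the same way.]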

specification so that the rule does not have to be stated explicitly for every data type in its
specification. In addition, the induction rule should be strong enough so that, for example,
a formula scheme like (2) and the normal form theorem (1) can be derived in case of N_0.
The above discussion shows that the scheme (3) is not powerful enough. However, the
infinite induction rule (+) for N_0 does the job. It can be shown that the scheme (2) and the
formula (1) are derivable from that rule.
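The reason the infinite rule suffices for N_0 is that every constructor ground term over 0, S, and P can be rewritten to either 0 or a successor term. The Python sketch below illustrates this on ground terms. It assumes the axioms P(0) = 0 and P(S(x)) = x; the first of these is an assumption of this sketch, since the right-hand side of that axiom is not legible in Figure 4.4. The names ZERO, S, P, normal_form, and is_zero_or_successor are likewise hypothetical.

```python
# Ground terms of N_0 over the constructors 0, S and P, as nested tuples.
ZERO = ("0",)

def S(x):
    """Build the term S(x)."""
    return ("S", x)

def P(x):
    """Build the term P(x)."""
    return ("P", x)

def normal_form(e):
    """Rewrite a ground term to the form S(S(...S(0)...)), eliminating P."""
    if e == ZERO:
        return ZERO
    op, arg = e
    arg = normal_form(arg)
    if op == "S":
        return ("S", arg)
    # op == "P"
    if arg == ZERO:
        return ZERO        # assumed axiom P(0) = 0
    return arg[1]          # axiom P(S(x)) = x

def is_zero_or_successor(e):
    """Formula (1) instantiated at the ground term e."""
    n = normal_form(e)
    return n == ZERO or n[0] == "S"

print(normal_form(P(S(S(ZERO)))))
```

Because the check succeeds on every ground term, each premise of the rule (+) for formula (1) is an equational fact about one term; the nonstandard elements of Figure 4.5 are not denoted by any ground term and so never appear among the premises.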

Another alternative for characterizing the minimality property is to use
multisorted second order predicate calculus as the underlying logic and express the
minimality property as a second order formula. But this approach is not attractive because
of the reasons discussed in the first section.


4.2.4.3 Use of the Induction Rule 


For using the induction rule (+), we must establish infinitely many premises. This
can be done by imposing a partial ordering on the set of constructor ground terms and
using induction on ground terms. We discuss below a technique for doing this. We start
with an instantiation of this technique which uses the structure of the ground terms; this
method is known as structural induction [6]. We show that

(i) for each basic constructor σ : D_1 × ... × D_n → D, which does not take any argument
of type D, Φ[x/σ(e_1, ..., e_n)] is provable, and

(ii) for every other constructor σ ∈ Ω, Φ[x/σ(e_1, ..., e_n)] is provable assuming Φ[x/e_i] for
every D_i = D.

However, there are situations when structural induction is not useful or convenient;
instead, a different partial ordering on ground terms is preferable.

We present below a generalized technique. Let G stand for the set of all
constructor ground terms of type D. We can define an ordering relation (non-reflexive,
antisymmetric, and transitive) < on G such that (G, <) satisfies the minimum condition.
Defining < on G gives a generalized (Noetherian) induction rule [10] on G.


Def. 4.2 (G, <) satisfies the minimum condition iff for every nonempty subset A of G, A has
a minimal element with respect to <.³ □


Generalized Induction Rule:
If for every e ∈ G, (for every element e' ∈ G such that e' < e, Φ[x/e']) ⇒ Φ[x/e],
then (∀ e ∈ G) Φ[x/e].

So, in order to establish the infinitely many premises of the ‘infinite’ induction rule (+), we
define a partial ordering < on the constructor ground terms in G such that (G, <) has the
minimum condition and use the generalized induction rule.
Using the nonlogical axioms of S, one can identify a subset C of G such that for
every constructor ground term e ∈ G, there is a ground term e' in C such that
‘e = e'’ ∈ EQ(S). We can then simplify the induction rule using the following rule of first
order predicate calculus:

(e = e') ⊢ Φ[x/e'] ⇒ Φ[x/e].

We need to show only that for every ground term e ∈ C, Φ[x/e]. For example, it can be
shown in case of Bool that for every boolean ground term e, either ‘e = T’ ∈ EQ(Bool) or
‘e = F’ ∈ EQ(Bool). So to prove a property having a free variable of type Bool by
induction, it suffices to show that the property holds in case of T and F.
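For Bool the representative subset C is simply {T, F}, so a property with free boolean variables can be settled by a finite case analysis. A small Python sketch of that case analysis follows; the function name holds_for_all_booleans is an assumption made for this example, and Python's True and False stand in for T and F.

```python
from itertools import product

def holds_for_all_booleans(phi, arity):
    """Check phi on every assignment of T/F to its `arity` boolean variables."""
    return all(phi(*values) for values in product([True, False], repeat=arity))

# A property that holds in both cases of each variable (De Morgan's law).
de_morgan = holds_for_all_booleans(
    lambda a, b: (not (a and b)) == ((not a) or (not b)), 2)

# A property that fails in at least one case.
not_valid = holds_for_all_booleans(lambda a: a == (not a), 1)

print(de_morgan, not_valid)
```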
Let us consider the example of Set-Int'. The induction rule (+) for Set-Int' is


3. The property of a set A satisfying the minimum condition with respect to an ordering relation < is related
to the well foundedness property of A with respect to <. It can be shown that A is well founded with respect to
< if and only if (A, <) satisfies the minimum condition.




For every constructor ground term e of type Set-Int', Φ[x/e] ⊢ (∀ x) Φ(x).
The following theorem establishes that the constructor Remove is derived in the sense that
it does not construct any value of Set-Int' distinguishable from the values constructed by
Null and Insert.


Thm. 4.1 Every constructor ground term e of type Set-Int' is equivalent by equational
reasoning to a ground term e' not having any occurrence of Remove, i.e., the equation
‘e = e'’ ∈ EQ(Set-Int').


Proof Using induction on the number of Remove (and subsequently the number of Insert)
in a constructor ground term, we show the above with the help of the axioms 1 and 2 of
Set-Int'. For details, see Appendix II. □


Using this theorem, we get a simpler induction rule for Set-Int':

(4) For every constructor ground term e of type Set-Int' having only the occurrences of
Null and Insert, Φ[x/e] ⊢ (∀ x) Φ(x).

We can define an ordering generated by the following relation on ground terms
constructed using Null and Insert:

[Null < Insert(x, i), and x < Insert(x, i)]

for any constructor ground term x and integer constructor ground term i. Using the
induction rule (4), we can prove for any formula Φ,

(5) (Φ[x/Null] ∧ (∀ x) [Φ(x) ⇒ (∀ i) Φ(Insert(x, i))]) ⇒ (∀ x) Φ(x).

We also get the following normal form theorem for Set-Int' using (5):

(∀ s) [(s = Null) ∨ (∃ s') (∃ i) s = Insert(s', i)].

Note that the above formula is different from Theorem 4.1. (The above formula is not in
IND(S) because of the use of the existential quantifier in it, but it is in Th(S) as discussed
later.) Theorem 4.1 cannot be expressed in first order predicate calculus. Using the
scheme (5) and the nonlogical axioms of Set-Int', we prove ‘Has(Remove(s, i), i) = F’ in
Figure 4.6. Recall that this formula could not be proved in DS(Set-Int').
The inductive subtheory IND(S) consists of equations and inequalities, and is
defined to be the set of formulas derived from the nonlogical axioms using the six rules
discussed in the last subsection (meaning DS(S) ⊆ IND(S)) and the infinite induction rule




Figure 4.6. Proof of 'Has(Remove(s, i), i) = F'


We use the formula scheme (5) above.

Basis. Has(Remove(Null, i), i) = Has(Null, i) = F                    Axioms 1, 3.

Inductive Step. Assume Has(Remove(s, i), i) = F,
to show (∀ i1) [ Has(Remove(Insert(s, i1), i), i) = F ]:


Case 1: i = i1.
Has(Remove(Insert(s, i1), i), i) = Has(Remove(s, i), i) = F          Axiom 2, and the assumption.


Case 2: ~ (i = i1).
Has(Remove(Insert(s, i1), i), i) = Has(Insert(Remove(s, i), i1), i)  Axiom 2.
                                 = Has(Remove(s, i), i) = F          Axiom 4 and the assumption.


Using the scheme (5), we get Has(Remove(s, i), i) = F.


(+). We later discuss the conditions under which formulas in IND(S) can be proved using
the Knuth-Bendix algorithm (Subsection 4.2.7).
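
The property proved in Figure 4.6 can also be checked mechanically on small terms. The following Python sketch is only an illustration: it represents ground terms built from Null and Insert as nested tuples and treats axioms 1 to 4 as recursive definitions. The exact wording of the four axioms is an assumption reconstructed from how the proof uses them, and the names `remove`, `has`, and `terms_up_to` are hypothetical.

```python
# Sketch: ground terms of Set-Int' over Null and Insert as nested tuples.
# The four axioms below are assumed readings of axioms 1-4 of Set-Int'.
NULL = ("Null",)

def insert(s, i):
    return ("Insert", s, i)

def remove(s, i):
    if s == NULL:                      # Axiom 1: Remove(Null, i) = Null
        return NULL
    _, s1, i1 = s                      # Axiom 2: case split on i1 = i
    return remove(s1, i) if i1 == i else insert(remove(s1, i), i1)

def has(s, i):
    if s == NULL:                      # Axiom 3: Has(Null, i) = F
        return False
    _, s1, i1 = s                      # Axiom 4: case split on i1 = i
    return True if i1 == i else has(s1, i)

def terms_up_to(depth, ints):
    """All terms built from Null by at most `depth` Inserts over `ints`."""
    level, out = [NULL], [NULL]
    for _ in range(depth):
        level = [insert(s, i) for s in level for i in ints]
        out.extend(level)
    return out

INTS = (0, 1, 2)
TERMS = terms_up_to(3, INTS)
# Has(Remove(s, i), i) = F on every enumerated term and integer.
PROPERTY_HOLDS = all(not has(remove(s, i), i) for s in TERMS for i in INTS)
```

An exhaustive check of this kind covers only finitely many terms; it is the induction scheme (5) that extends the result to all values of Set-Int'.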
4.2.4.4 Specifications with Nontrivial Preconditions for Constructors 


The induction rule (+) is also applicable to specifications specifying nontrivial
preconditions for the constructors, as it captures a general property of data types and not a
property of specifications. It can be simplified depending on the semantics assumed for a
constructor σ on inputs not satisfying its precondition.

If nontrivial preconditions are specified for constructors, we are interested in
constructor ground terms in which the input to every constructor invocation satisfies the
specified precondition. This is so because a constructor is not likely to be invoked with an
input not satisfying the specified precondition. Even if the constructor is invoked on such
an input, we are not interested in its behavior.


Def. 4.3 A constructor ground term e is called legal if and only if (i) e does not have any
occurrence of an auxiliary function, and (ii) for every subterm of e of the form
e_1 = σ(e_11, ..., e_1n), where σ is a constructor, 'P_σ(e_11, ..., e_1n) = T' ∈ EQ(S).


The restriction that 'P_σ(e_11, ..., e_1n) = T' ∈ EQ(S) is for convenience; we could have
required the formula to be in Th(S), the full theory constructed from S. (Recall that P_σ(X)
is a boolean term without involving any quantifier.) We are mostly interested in formulas
involving legal ground terms.

Assuming the semantics used in Chapter 3 (i.e., on an input not satisfying its
precondition, σ returns a value of D constructible by the constructors of D using inputs


Figure 4.7. Specification of Stk-Int


Stk-Int as Stk
Operations

    Null    :             → Stk
    Push    : Stk × Int   → Stk
                          → overflow(Stk, Int)
    Pop     : Stk         → Stk
    Top     : Stk         → Int
                          → no-top()
    Replace : Stk × Int   → Stk
    Empty   : Stk         → Bool

Auxiliary Functions
    Size    : Stk         → Int        as #(x)

Restrictions

    Pre(Pop(s)) :: ~ Empty(s)
    Pre(Replace(s, i)) :: ~ Empty(s)

    Empty(s) ⇒ Top(s) signals no-top()
    Push(s, i) signals overflow(s, i) ⇒ #(s) > 100

Axioms

    1. Pop(Push(s, i)) = s
    2. Top(Push(s, i)) = i
    3. Replace(s, i) = Push(Pop(s), i)
    4. Empty(Null) = T
    5. Empty(Push(s, i)) = F
    6. #(Null) = 0
    7. #(Push(s, i)) = #(s) + 1




satisfying their preconditions⁴), the induction rule (+) gets simplified to
for every legal constructor ground term e of type D, Φ[x/e] ⊢ (∀x) Φ(x).
This is so because every constructor ground term that is not legal is equivalent to some legal
constructor ground term by the above assumption.
If the above assumption about the behavior of σ is dropped and nothing is
assumed about its behavior on inputs not satisfying the preconditions, then we have
for every legal constructor ground term e of type D, Φ[x/e] ⊢

(∀x) [ ( ∨_{1≤i≤m} (∃ x_1, ..., x_{n_i}) [ x = σ_i(x_1, ..., x_{n_i}) ∧ P_{σ_i}(x_1, ..., x_{n_i}) = T ] ) ⇒ Φ(x) ],

where { σ_1, ..., σ_m } is the set of constructors of D. The condition in the matrix of the
consequence of the above rule ensures that x ranges over values serving as the
interpretations of the legal ground terms of D. This is the strongest consequence we can
have because the interpretation of illegal constructor ground terms is not known. For
example, if we drop the restrictions in the specification of Stk-Int repeated in Figure 4.7
specifying the exceptional behavior of the operations, the modified specification associates
preconditions with the constructors Pop and Replace. The induction rule would then be

for every legal constructor ground term e of type Stk-Int, Φ[s/e] ⊢

(∀s) [ ( (s = Null) ∨ (∃ s', i') s = Push(s', i') ∨ (∃ s') [ ~ Empty(s') = T ∧ s = Pop(s') ]

∨ (∃ s', i') [ ~ Empty(s') = T ∧ s = Replace(s', i') ] ) ⇒ Φ(s) ].

We have discussed in Chapter 3 the reasons for assuming that a constructor σ on
an input not satisfying its precondition can either signal an exception or return a value
constructible by the constructors using inputs satisfying their preconditions. An additional
reason for this assumption is that otherwise the induction rule gets complex, as should be
evident from the above discussion.
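
The notion of a legal constructor ground term (Def. 4.3) can be illustrated on Stk-Int with the exceptional behavior dropped, so that Pop and Replace carry the precondition ~ Empty(s). The Python sketch below is a simplification under stated assumptions: membership of 'P_σ(...) = T' in EQ(S) is approximated by evaluating the argument in a tuple model of stacks, and the names `evaluate` and `is_legal` are hypothetical.

```python
# Sketch: Stk-Int terms as nested tuples, e.g. ("Push", ("Null",), 1).
CONSTRUCTORS = ("Null", "Push", "Pop", "Replace")

def evaluate(t):
    """Interpret a constructor term in a tuple model of stacks (assumed model)."""
    op = t[0]
    if op == "Null":
        return ()
    if op == "Push":
        return evaluate(t[1]) + (t[2],)
    if op == "Pop":
        return evaluate(t[1])[:-1]
    if op == "Replace":
        return evaluate(t[1])[:-1] + (t[2],)
    raise ValueError("not a constructor: " + op)

def is_legal(t):
    """Check clauses (i) and (ii) of Def. 4.3 for a Stk-Int term."""
    op = t[0]
    if op not in CONSTRUCTORS:         # clause (i): e.g. the auxiliary Size
        return False
    if op == "Null":
        return True
    if not is_legal(t[1]):             # every subterm must itself be legal
        return False
    if op in ("Pop", "Replace"):       # clause (ii): precondition ~ Empty(s)
        return len(evaluate(t[1])) > 0
    return True
```

For instance, Pop(Push(Null, 1)) is legal, while Replace(Pop(Push(Null, 1)), 2) is not, because its argument evaluates to the empty stack.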


4. σ can also signal on such an input; since we are considering data types without exceptional behavior, this
choice is ruled out.




4.2.5 The Full Theory 


In proving properties of programs, one often uses properties of data types other
than equations and inequalities. For example, we often need to prove properties of the
form '(e_11 = e_12 ∧ ... ∧ e_n1 = e_n2) ⇒ (e_1 = e_2)'. Or, we may need a formula involving
existential quantifiers. For example, consider the union procedure on sets of integers
written in a CLU-like language and given in Figure 4.8. The integer variable i inside the
loop defines the range (-i+1, i-1) of integers which have been checked to be members of
the first argument and if so, have been inserted into the result being computed. The
variable i is incremented every time the loop is executed. To prove the termination of
union, we need to show that a set is either empty or there is an integer k such that every
element of the set lies in the range (-k, k). The following formula expresses this property:

(6) (∀s) [ s = Null ∨ (∃k) (∀j) [ Has(s, j) = T ⇒ (j ≤ k ∧ j ≥ -k) ] ]
To prove such properties, we need the whole machinery of first order predicate calculus
with identity. The proof of (6) is given in Figure 4.9.

The full theory Th(S) is the set of formulas derivable from the nonlogical axioms
of S and Th(S'), where S' is a specification of a defining type or an auxiliary type used in S,
using the logical axioms and rules of inference of multi-sorted first order predicate calculus


Figure 4.8. Procedure Union


union = proc(s1, s2 : Set-Int') returns (Set-Int')

    i  : Int := 0
    r1 : Set-Int' := s1
    r2 : Set-Int' := s2

    while ~ (Set-Int'$Size(r1) = 0) do
        if Set-Int'$Has(r1, i) then r1 := Set-Int'$Remove(r1, i)
                                    r2 := Set-Int'$Insert(r2, i)
        end
        if Set-Int'$Has(r1, -i) then r1 := Set-Int'$Remove(r1, -i)
                                     r2 := Set-Int'$Insert(r2, -i)
        end
        i := i + 1
    end
    return (r2)
end union
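
For readers who want to run the procedure, here is a rough Python transliteration of Figure 4.8. It is a sketch, not the report's code: Python's built-in `set` stands in for Set-Int' (an assumption), and the function name `union` simply mirrors the figure. The loop terminates because every element of a finite set of integers lies within some range (-k, k), which is exactly what formula (6) asserts.

```python
def union(s1, s2):
    """Move the elements of s1 into s2 by probing i and -i for i = 0, 1, 2, ..."""
    i = 0
    r1 = set(s1)          # local copy, drained by the loop
    r2 = set(s2)          # result being computed
    while len(r1) != 0:
        if i in r1:
            r1.remove(i)
            r2.add(i)
        if -i in r1:
            r1.remove(-i)
            r2.add(-i)
        i = i + 1
    return frozenset(r2)
```

With s1 = {3, -5, 0} the loop stops once i has passed 5, the largest absolute value in s1.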


Figure 4.9. Proof of the Formula (6)


To prove (∀s) [ s = Null ∨ (∃i) (∀j) [ Has(s, j) = T ⇒ (j ≤ i ∧ j ≥ -i) ] ]
Using the scheme (5),

Φ(s) = [ s = Null ∨ (∃i) (∀j) [ Has(s, j) = T ⇒ (j ≤ i ∧ j ≥ -i) ] ]
Basis. Φ(Null) ⇔ T


Inductive Step. Assume Φ(s), to show (∀k) Φ(Insert(s, k))
Since Φ(s) ⇔ T, we have two cases.


Case 1: s = Null
    Φ(Insert(Null, k)) ⇔ T, because i is |k|, the absolute value of k.
Case 2: (∃i) (∀j) [ Has(s, j) = T ⇒ (j ≤ i ∧ j ≥ -i) ]
    Subcase 1: -i ≤ k ≤ i
        i itself serves to prove that Φ(Insert(s, k)) ⇔ T from Φ(s).
    Subcase 2: k > i ∨ k < -i
        |k| serves as i to prove that Φ(Insert(s, k)) ⇔ T from Φ(s).
Using the scheme (5), we have (∀s) Φ(s).


with identity, as well as the infinitary induction rule (+).
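
The inductive proof in Figure 4.9 is constructive: each case names the witness for the existential quantifier. The Python sketch below turns that case analysis into a function computing a witness for a term built from Null and Insert. The tuple encoding of terms and the names `bound` and `has` are assumptions of this example, and `has` follows the assumed reading of axioms 3 and 4 of Set-Int'.

```python
NULL = ("Null",)

def insert(s, k):
    return ("Insert", s, k)

def has(s, j):
    """Assumed axioms 3 and 4: Has(Null, j) = F; Has(Insert(s, k), j) tests k = j first."""
    while s != NULL:
        _, s, k = s
        if k == j:
            return True
    return False

def bound(s):
    """Witness i of formula (6), or None when s = Null (first disjunct holds)."""
    if s == NULL:
        return None
    _, s1, k = s
    i = bound(s1)
    if i is None:          # Case 1: s1 = Null, take |k|
        return abs(k)
    if -i <= k <= i:       # Case 2, Subcase 1: the old witness still works
        return i
    return abs(k)          # Case 2, Subcase 2: k lies outside, take |k|

S = insert(insert(insert(NULL, 3), -7), 2)
```

Here `bound(S)` follows the three steps 3, then 7, then 7 again, matching Case 1, Subcase 2, and Subcase 1 respectively.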


The following diagram summarizes the relationships among different subtheories
and the full theory:


    Th(S)    First Order Predicate Calculus + Infinite Induction Rule
    IND(S)   DS(S) + Infinite Induction Rule
    DS(S)    EQ(S) + Proof by Contradiction
    EQ(S)    Four Rules of = and the Substitution Rule of ∀


The following theorem shows that the above deductive system is sound.




Thm. 4.2 For any two ground terms e_1 and e_2,

(i) if 'e_1 = e_2' ∈ Th(S), then e_1 and e_2 are observably equivalent by S (i.e., observably
equivalent in the models in F(S)), and

(ii) if 'e_1 ≠ e_2' ∈ Th(S), then e_1 and e_2 are distinguishable by S.


Proof The theorem follows from the facts that (a) the nonlogical axioms hold in the
models in F(S) with = interpreted as the observable equivalence relation, (b) the
observable equivalence relations are preserved by the functions in the models in F(S). ∎


4.2.6 Properties of a Specification 


We can define properties desirable of a specification by requiring that various
subtheories and the full theory derived from the specification satisfy certain conditions.
Guttag and Horning [28] have discussed the sufficient completeness property for a
restricted class of specifications, which has been found useful. We state that property in
our framework. We extend it to specifications using auxiliary functions and specifying
preconditions for the operations. The sufficient completeness property captures the
intuitive notion that the behavior of the observers is completely specified on intended
inputs and that the result of an observer on an intended input can be deduced by
equational reasoning. We relate this property to the behavioral completeness property
defined in the previous chapter and show that sufficient completeness is stronger than
behavioral completeness (Theorem 4.4) because behavioral completeness only requires that
the behavior of the observers be completely specified on intended inputs; it does not
say anything about what can be deduced from the specification.

When specifications are used to prove properties of programs using the data types
being specified, we often need to relate different constructor sequences. In that case, it is
desirable to have a specification satisfy a stronger property than sufficient completeness,
which in addition to the requirement that the behavior of the observers can be deduced by
equational reasoning on any intended input, also requires that the equivalence of the
observable effect of different constructors can be deduced by equational reasoning. We
call this property the completeness property of a specification and define it precisely. We
later see that for a complete and consistent specification S, formulas in IND(S) can be
proved using the Knuth-Bendix algorithm (see Subsection 4.2.7).

Recall from Section 3.5 that for a consistent and behaviorally complete
specification S, the models in F(S) are behaviorally equivalent w.r.t. { P_σ | σ ∈ Ω }.
Furthermore, if S does not specify any nontrivial precondition for the operations, the
semantics of a specification S is a single data type, a set of behaviorally equivalent algebras.
In that case, any two ground terms of type D are either observably equivalent by S
or distinguishable by S. An obvious question is whether the proposed deductive system is
powerful enough to deduce this from a consistent and behaviorally complete specification.
We show that it is not the case. But if a specification is consistent and complete, then the
deductive system has this property.

Since S is hierarchical, S should preserve the specifications of the types used in S.
S should only specify the behavior of the operations of D, and it should not specify the 
behavior of a type D’ used in S that is not captured by its specification S’. Specifications so 
designed are modularly structured; they support the factoring and hierarchical structuring 
of the proof of correctness of a hierarchically designed implementation. We define the well 
definedness property of a specification which captures this modularity requirement. 


Before we discuss these properties, we prove 


Thm. 4.3 For a consistent S, for any two ground terms e_1 and e_2 of the same type, both
'e_1 = e_2' and 'e_1 ≠ e_2' cannot be in Th(S).


Proof If S is consistent, then F(S) ≠ ∅.

Suppose for some e_1 and e_2, both 'e_1 = e_2' and 'e_1 ≠ e_2' are in Th(S). By Theorem 4.2,
'e_1 = e_2' ∈ Th(S) implies that e_1 and e_2 are observably equivalent by S. Similarly,
'e_1 ≠ e_2' ∈ Th(S) implies that e_1 and e_2 are distinguishable by S, which is a contradiction. ∎




4.2.6.1 Sufficient Completeness 


As was said earlier for constructors, for a specification specifying nontrivial
preconditions for the operations, one is interested in ground terms in which the input to
every occurrence of an operation symbol satisfies the associated precondition. This is so
because an operation is not likely to be invoked with an input not satisfying the specified
precondition. Even if the operation is invoked on such an input, we are not interested in its
behavior. Furthermore, if a specification uses auxiliary functions, ground terms in which
auxiliary functions appear are also not of interest because they are not used in programs
using the data type. Earlier we defined a legal constructor ground term (Def. 4.3); below,
we extend the definition to a ground term.


Def. 4.4 A ground term e is called legal if and only if (i) e does not have any occurrence of
an auxiliary function, and (ii) for every subterm of e of the form e_1 = σ(e_11, ..., e_1n),
where σ ∈ Ω, 'P_σ(e_11, ..., e_1n) = T' ∈ EQ(S). ∎


For a specification using auxiliary functions and specifying nontrivial preconditions, only
legal ground terms are interesting. If such a specification is consistent and behaviorally
complete, any two legal ground terms are either observably equivalent by S or
distinguishable by S (see Section 3.5).

In [28], Guttag and Horning define the sufficient completeness property of
specifications which do not specify a nontrivial precondition for the operations and do not
use auxiliary functions. We state their definition in our framework.


Def. 4.5 A specification S is sufficiently complete if and only if for every ground term e of
type D' ∈ Δ, there exists a theorem derivable from S of the form 'e = e'', where e' is a
ground term of type D' without any occurrence of an operation symbol of D. ∎


In [28], the deductive system to be used to derive a theorem is not specified. Guttag [33]
requires that the equation 'e = e'' be in the equational subtheory EQ(S).

The sufficient completeness property can be extended to specifications using
auxiliary functions and specifying nontrivial preconditions for the operations. For auxiliary
functions, there are two possible extensions:

(i) consider only the ground terms expressed using the operation symbols, because only
these terms can be used in a program, or

(ii) consider all ground terms, thus requiring that auxiliary functions also be completely
specified.

We take the former approach; however, we recommend that whenever an auxiliary
function is used, it be completely specified.


Def. 4.6 A specification is sufficiently complete if and only if for every legal ground term e
of type D' ∈ Δ, there is a formula 'e = e'' ∈ EQ(S), where e' is a legal ground term of type D'
without having any operation symbol of D or any auxiliary function. ∎


For example, the specification of Set-Int' is not sufficiently complete, because for instance,
a legal ground term Choose(Insert(Insert(Null, 1), 2)) cannot be related to any ground term
of type Int that does not have any occurrence of an operation symbol of Set-Int'.

The following theorem relates sufficient completeness to behavioral
completeness. The intuition behind this result is that if the behavior of observers on
intended inputs can be deduced by equational reasoning from S, then the observers must
be completely specified by S.


Thm. 4.4 If a specification S is sufficiently complete, then S is behaviorally complete.
Proof See Appendix III. ∎


The converse of the above theorem however does not hold. So, the sufficient
completeness property is strictly stronger than behavioral completeness, as there are
specifications which are behaviorally complete but are not sufficiently complete. This is so
because in the definition of sufficient completeness, only a fragment of the deductive
system of first order predicate calculus is used to derive properties from the specification.
There can exist a legal ground term e of type D' ∈ Δ such that we cannot derive 'e = e'' for
any e' of type D' not having any occurrence of an operation symbol of D in the equational
subtheory EQ(S). However, we can derive the above equation in Th(S) using other rules in
addition to the rules of the equational subtheory. We illustrate this point using the
specification of Set-Int'. We add another axiom defining Choose on sets of size > 1 as
returning the maximum integer in the set.
8. Choose(Insert(Insert(s, i1), i2)) = if Size(s) = 0 then (if ~ i1 = i2 then Max(i1, i2))
   else (if ~ i1 = i2 then Max(Choose(Insert(s, i1)), i2) else Choose(Insert(s, i1))).
The modified specification is not sufficiently complete, because Choose(Insert(Null, i)) is
not directly specified. Nor can we deduce by equational reasoning that
'Choose(Insert(Null, i)) = i'. However, using the theorem of Int, '((i = j) = T) ⇒ i = j',
derived using the induction rule for integers, the axioms 3, 4, and 7 of Set-Int', and case
analysis, we can prove by contradiction that
Choose(Insert(Null, i)) = i.
It should be obvious that with a minor modification of the proof of Theorem 4.4, we can
prove the following generalization of Theorem 4.4:


Thm. 4.5 If for every legal ground term e of type D' ∈ Δ, there exists a ground term e' of
type D' not having any operation symbol of D and auxiliary function such that 'e = e'' ∈
Th(S), then S is behaviorally complete. ∎


Theorem 4.4 can be derived as a corollary of the above theorem. We conjecture that the
converse of the above theorem is also true, which says that the deductive system is
complete with respect to deducing the behavior of an observer on an intended input.


Conjecture 4.1 If S is behaviorally complete, then for every legal ground term e of type D'
∈ Δ, there exists a ground term e' of type D' not having any operation symbol and auxiliary
function such that 'e = e'' ∈ Th(S).


We can prove the following partial completeness result about the deductive
system in proving the distinguishability of legal ground terms of type D', D' ∈ Δ ∪ { D }.


Thm. 4.6 For a consistent and sufficiently complete S, if any two legal ground terms e_1 and
e_2 of type D are distinguishable by S, then 'e_1 ≠ e_2' ∈ DS(S).


Proof See Appendix III. ∎


If Conjecture 4.1 is true, then we can prove a similar result about behaviorally complete
specifications: For a consistent and behaviorally complete specification S, if any two legal
ground terms e_1 and e_2 of type D are distinguishable by S, then 'e_1 ≠ e_2' ∈ Th(S).
4.2.6.2 Completeness 


We cannot prove a similar result about the observable equivalence of legal
ground terms of type D, because we do not have a rule analogous to proof by contradiction
in the deductive system that enables us to prove the observable equivalence of ground
terms unless explicitly specified by the nonlogical axioms. Different but equivalent
specifications of the same data type can differ in the extent to which the observable
equivalence relation of legal ground terms of D can be proved from the nonlogical axioms.
For example, the terms Insert(Insert(Null, 2), 2) and Insert(Null, 2) are observably
equivalent by Set-Int', but 'Insert(Insert(Null, 2), 2) = Insert(Null, 2)' ∉ EQ(Set-Int'). If we
add the following axiom to the specification of Set-Int':

9. Insert(Insert(s, i1), i2) = if i1 = i2 then Insert(s, i1) else Insert(Insert(s, i2), i1),
then 'Insert(Insert(Null, 2), 2) = Insert(Null, 2)' ∈ EQ(Set-Int'). The semantics of the
modified specification is the same as the semantics of the original specification of Set-Int'.
The more a specification of D captures the observable equivalence relation on terms of type
D, the more useful it is in deriving the theory of D and hence in proving properties of
programs using D. We define below a property of a specification requiring it to completely
specify the observable equivalence relation. We put a stronger requirement: We want
EQ(S), instead of Th(S), to have a formula 'e_1 = e_2' for two legal ground terms e_1, e_2 if and
only if e_1 and e_2 are observably equivalent by S, so that such formulas can be derived by
purely equational reasoning (i.e., using the rules of = and the substitution rule for ∀).
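
The effect of axiom 9 can be made concrete with a small normalizer. The Python sketch below is an illustration under stated assumptions, not the report's method: since the else-branch of axiom 9 merely swaps two Inserts, using it as a left-to-right rewrite rule would not terminate, so the sketch applies it only when the swap moves the larger integer outward (a form of ordered rewriting chosen here for convenience). The names `normalize` and `place` are hypothetical.

```python
NULL = ("Null",)

def insert(s, i):
    return ("Insert", s, i)

def place(s, i2):
    """Normal form of Insert(s, i2), given that s is already in normal form."""
    if s == NULL:
        return insert(NULL, i2)
    _, rest, i1 = s
    if i1 == i2:                       # axiom 9, then-branch: drop the duplicate
        return s
    if i1 > i2:                        # axiom 9, else-branch: swap i1 outward
        return insert(place(rest, i2), i1)
    return insert(s, i2)               # already ordered, nothing to do

def normalize(t):
    """Rewrite a Null/Insert term so its integers increase from inside out."""
    if t == NULL:
        return NULL
    _, s, i = t
    return place(normalize(s), i)
```

Two Null/Insert terms have the same normal form exactly when they insert the same integers, so each rewriting step used here is an instance of axiom 9 and equal normal forms witness an equation in EQ of the extended specification.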


Def. 4.7 A sufficiently complete specification S is complete if and only if, assuming that the
specification S' of each D' ∈ Δ ∪ Δ_a is complete, for any two legal ground terms e_1 and e_2
of the same type, 'e_1 = e_2' ∈ EQ(S) if and only if e_1 and e_2 are observably equivalent by S. ∎


The completeness property of a specification should not be confused with the completeness
property of a theory of an algebraic structure as defined in Logic [16]. Using Theorems 4.4
and 4.6, and the fact that for a consistent and behaviorally complete specification, any two
legal ground terms are either observably equivalent or distinguishable by S, we have


Thm. 4.7 For a consistent and complete specification S, for any legal ground terms e_1 and
e_2 of the same type, either 'e_1 = e_2' ∈ DS(S) or 'e_1 ≠ e_2' ∈ DS(S). ∎


Musser [61] has called a specification from which either 'e_1 = e_2' or 'e_1 ≠ e_2' can be
derived in DS(S) to be fully specified, though his view of a specification is somewhat
different. He views the operator '=' as another operation of a data type, whereas we
consider '=' as a predicate in the underlying logic used to construct formulas.
4.2.6.3 Well Definedness 


We would like a specification S to be modular, i.e., for the specification S' of each
D' ∈ Δ ∪ Δ_a, Th(S) | L(S') = Th(S'). This means that Th(S) does not have a formula
expressed using symbols in L(S') that is not in Th(S'). Only those properties which involve
an operation symbol of D and/or auxiliary functions used in S can be proved from S; a
formula not having any operation symbol of D or an auxiliary function in S and not in
Th(S') cannot be proved from S.

For a consistent and sufficiently complete specification, the following holds: 


Thm. 4.8 For a consistent and sufficiently complete S, for any legal ground terms e_1, e_2 of
type D' ∈ Δ constructed using the symbols in L(S'), if neither 'e_1 = e_2' ∈ Th(S') nor
'e_1 ≠ e_2' ∈ Th(S'), where S' is a specification of D', then 'e_1 ≠ e_2' ∉ Th(S).

Proof By contradiction.
Suppose 'e_1 ≠ e_2' ∈ Th(S), meaning that e_1 and e_2 are distinguishable by S (as well as by
S') (by Theorem 4.2). By Theorem 4.6, 'e_1 ≠ e_2' ∈ Th(S'), which is not the case. So the
theorem. ∎


However, we could have a specification S such that 'e_1 = e_2' ∈ Th(S) in the above case. The
following property of a specification rules out such cases.




Def. 4.8 A specification S is well defined if and only if for every D' ∈ Δ ∪ Δ_a, assuming that
S' of D' is well defined, Th(S) | L(S') = Th(S'). ∎


We are usually interested in well defined and complete specifications.
Behaviorally incomplete specifications are occasionally of interest. Set-Int' is such an
example.
4.2.7 Automation of IND(S) 


Recently Musser [61] has discussed how to automate IND(S) when S satisfies
certain conditions. If (i) S is consistent and complete, and (ii) the nonlogical axioms
derived from S can be written as equations (possibly using the if-then-else operator), then the
Knuth-Bendix algorithm, which treats equational axioms as rewrite rules, can be used to
derive an equational formula 'e_1 = e_2' in the inductive subtheory IND(S). The equation
'e_1 = e_2' is input to the algorithm as a rewrite rule to get a new convergent set of rules
having the added rewrite rule. There are three possibilities:

(i) the algorithm succeeds, implying that the new equation is consistent with the
nonlogical axioms and thus provable,

(ii) an inconsistency, such as a rule 'e_3 → e_4' where e_3 and e_4 can be proved to be not equal, in
particular 'T → F' or 'F → T', is generated as a rule, implying that the equation is not a
theorem, and

(iii) the algorithm does not terminate, implying that (a) an additional lemma needs to be proved
first, which could be guessed from the set of new rules generated, (b) the specified ordering
on terms used by the algorithm does not work, and some other ordering needs to be tried,
or (c) there does not exist a finite convergent set of rules to express IND(S).


The basis of deducing from (ii) that 'e_1 = e_2' is not a theorem is the consistency of S and the
method of proof by contradiction; in fact 'e_1 ≠ e_2' is a theorem in IND(S) in this case. The
basis of deducing from (i) that 'e_1 = e_2' is a theorem in IND(S) is the completeness of the
specification: For a substitution of all variables in e_1 and e_2 by ground terms, the resulting
ground terms e_1' and e_2' have the property that either 'e_1' = e_2'' ∈ IND(S) or
'e_1' ≠ e_2'' ∈ IND(S).




4.3 Theory of Exceptions Without Nondeterminism 


We now incorporate the exceptional behavior of data types into their theories 
with the assumption that specifications do not specify nondeterministic operations. New 
atomic formulas are introduced to express the exceptional behavior of the operations. We 
describe how the nonlogical axioms of Th(S) can be derived in this case from a 
specification S. We discuss how to construct EQ(S), DS(S), IND(S), and Th(S). New 
‘logical’ axioms characterizing the exceptional behavior of the operations are presented. 
We extend the properties of a specification discussed in the previous section to 
specifications specifying the exceptional behavior. For illustration, we modify the
specification of Set-Int' so that the operation Choose is required to signal no-element() on
the empty set; let Set-Int" stand for the modified Set-Int'. So, instead of the Restrictions
component specifying a precondition for Choose, it specifies a required exception
condition as follows:

#(s) = 0 ⇒ Choose(s) signals no-element().
We also use the specification of Stk-Int.

Besides the operation symbols and auxiliary function symbols, the language L(S)
also includes the names of exceptions signalled by the operations as specified in S.
Exception terms are constructed as discussed in Chapter 2, using terms and exception
names. There are two new sets of atomic formulas in addition to equations:

(a) e signals ext,
where e is a term, ext is an exception term, and every variable in ext is also in e; and

(b) ext_1 = ext_2,
where ext_1 and ext_2 are exception terms. The predicate 'signals' is similar to = but its arity
is (D ∪ EXV) × EXV.

As in the previous section, we first discuss the derivation of the nonlogical axioms
of Th(S) from S. Then, we discuss the subtheories EQ(S), DS(S), and IND(S), and the full
theory Th(S). In the last subsection, we extend sufficient completeness, completeness, and
well definedness properties.




4.3.1 Derivation of Nonlogical Axioms 


The nonlogical axioms of Th(S) are derived from the restrictions and axioms
components of the specification S in a slightly different way than discussed in
Subsection 4.2.1. We first discuss the restrictions, and later the formulas in the axioms
component.


4.3.1.1 Restrictions Component 


From a restriction specifying a required exception signalled by an operation σ,
R_i(X) ⇒ σ(X) signals ext_i,⁵

we get the following nonlogical axiom:

P_σ(X) ⇒ (R_i(X) ⇒ σ(X) signals ext_i),
i.e., the restriction holds only if the input X satisfies the precondition associated with σ.

For example, the restriction on the operation Top in the specification of Stk-Int,

Empty(s) ⇒ Top(s) signals no-top(),
is a nonlogical axiom of Th(Stk-Int), as the precondition for Top is T. Similarly, from a
restriction specifying an optional exception signalled by an operation σ,

σ(X) signals ext_j ⇒ O_j(X),
we get

P_σ(X) ⇒ (σ(X) signals ext_j ⇒ O_j(X)),
as a nonlogical axiom. For example, the restriction on Push,

Push(s, i) signals overflow(s, i) ⇒ #(s) > 100,
is a nonlogical axiom of Th(Stk-Int).


5. Recall that the boolean term R_i(X) is an abbreviation for the formula R_i(X) = T.




4.3.1.2 Axioms Component 


The preconditions in the restrictions component are also used in constructing the
nonlogical axioms from the formulas in the axioms component of S. As discussed in
Chapter 3, a variable in a formula in the axioms component cannot be freely substituted.
When the exceptional behavior was not considered in Subsection 4.2.1, the substitution was
conditional: The arguments to every operation in the axiom must satisfy the associated
precondition. Now, there is an additional requirement: The substitution should not result
in an operation signalling on its arguments.

To express the second condition, we introduce a unary auxiliary function
N?_D' : D' ∪ EXV → Bool for every D' ∈ Δ ∪ { D } ∪ Δ_a. These auxiliary functions are not
used in a specification. Informally, N?_D' separates a normal value of D' from an exception: It
returns T if its argument interprets to a normal value of D'; it returns F if its argument
signals an exception. Furthermore, N?_D'(σ(e_1, ..., e_n)) is F if N?(e_i) is false for any e_i;
this constraint on the behavior of N?_D' enables us to get a simpler transformation of the
restricted formulas in the axioms component of S.

Using N?_D', we transform a restricted formula in the axioms component to an
unrestricted formula which serves as a nonlogical axiom of Th(S). If an equation 'e_1 = e_2' is
in the axioms component, where e_1 and e_2 are of type D, then the corresponding
unrestricted axiom is

(N?_D(e_1) ∧ N?_D(e_2)) ⇒ ((PC_e1 ∧ PC_e2) ⇒ e_1 = e_2),
where PC_e is a conjunction of conditions expressing the constraint that the input to every
operation invocation in a term e satisfies the associated precondition. Similarly, if a
restricted formula is 'e_1 = if b then e_2', then the corresponding unrestricted formula is

(N?_Bool(b) ∧ N?_D(e_1) ∧ N?_D(e_2)) ⇒ ((PC_b ∧ PC_e1 ∧ PC_e2) ⇒ (b ⇒ e_1 = e_2)).
If a restricted formula is 'e_1 = if b then e_2 else e_3', then the corresponding unrestricted
formulas are obtained using the fact that this formula is equivalent to two conditional
equations
e_1 = if b then e_2
e_1 = if ~ b then e_3.

We illustrate the above transformation on the following equation in the axioms component
of the specification of Stk-Int:
Replace(s, i) = Push(Pop(s), i).
The corresponding unrestricted axiom is
(N?_Stk-Int(Replace(s, i)) ∧ N?_Stk-Int(Push(Pop(s), i))) ⇒
(~ Empty(s) ⇒ Replace(s, i) = Push(Pop(s), i)).
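
To see what the guards in the unrestricted axiom buy, consider a small executable model. The Python sketch below is one possible model chosen for illustration, not the semantics fixed by the report: stacks are tuples, an exception value is an instance of a hypothetical class `Exc`, `normal` plays the role of N?, operations propagate exception arguments, Push never exercises its optional overflow, and Pop and Replace return the empty stack when their precondition fails. All of these choices are assumptions.

```python
class Exc:
    """A signalled exception value (hypothetical encoding of EXV)."""
    def __init__(self, name):
        self.name = name

def normal(v):                 # plays the role of N?: T on normal values only
    return not isinstance(v, Exc)

def push(s, i):
    if not normal(s):
        return s               # an exception argument propagates
    if not normal(i):
        return i
    return s + (i,)

def pop(s):
    if not normal(s):
        return s
    return s[:-1]              # on the empty stack the precondition fails; model returns ()

def top(s):
    if not normal(s):
        return s
    return Exc("no-top") if s == () else s[-1]

def empty(s):
    return s == ()

def replace(s, i):
    if not normal(s):
        return s
    return () if s == () else s[:-1] + (i,)   # arbitrary choice when s is empty

def unrestricted_axiom_3(s, i):
    """(N?(lhs) and N?(rhs)) => (~Empty(s) => lhs = rhs), for a normal stack s."""
    lhs, rhs = replace(s, i), push(pop(s), i)
    if not (normal(lhs) and normal(rhs)):
        return True            # N? guard fails, nothing is claimed
    if empty(s):
        return True            # precondition guard PC fails, nothing is claimed
    return lhs == rhs

STACKS = [(), (1,), (1, 2), (3, 3, 3)]
AXIOM_HOLDS = all(unrestricted_axiom_3(s, i) for s in STACKS for i in (0, 7))
```

In this model Replace((), 1) and Push(Pop(()), 1) differ, yet the unrestricted axiom still holds, because the precondition guard makes no claim about the empty stack.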


4.3.1.3 Definition of N?_D'


A specification of D implicitly defines N?_D and extends N?_D' for every defining
type D' of D as well as any auxiliary type D' used in S. N?_D' is defined by the specification
of D'. Since an operation σ has the arity D_1 × ... × D_n → D' ∪ EXV, and N?_D' has the arity
D' ∪ EXV → Bool, we need to introduce variables ranging over values of a type and
exceptions to characterize N?_D'. We have two options: (i) introduce two kinds of variables,
variables of a single type D_i and variables of a union type D_i ∪ EXV, or (ii) introduce only
variables of a union type. If we adopt the second alternative, the formulas expressing the
normal behavior of the operations get long because we make the conditional use of the
variables. Since we would mostly be using formulas expressing normal behavior, we have
adopted the first alternative. Often, we do not need to have a formula in which both kinds
of variables are mixed. Except in the axioms for N?_D' and the axioms characterizing the
general properties of the exceptional behavior of the data type, we would rarely use
variables of a union type. Terms as well as exception terms are constructed using only
variables ranging over a single type (except in the next section). Henceforth, we use xe,
xe_1, ..., xe_i, ..., ye, ye_1, ..., ze, ze_1, etc., as variables of a union type,
and exv, exv_1, ..., exv_i, ... as variables of type EXV.

We now discuss the axioms defining $N?_{D'}$. First of all, for a variable $x$ of type $D'$,
we have the axiom

$$N?_{D'}(x) = T.$$

For an operation $o$, let $P(X)$ be its precondition. Let us assume that the restrictions
component specifies for $o$, $l$ required exceptions and $m$ optional exceptions. For each
$1 \le i \le l$, let $R_i(X)$ be the condition on input $X$ when $o$ is required to signal an exception;
similarly, for each $1 \le j \le m$, let $O_j(X)$ be the condition when $o$ has an option to signal.
For every constructor $o$ of D, we have an axiom defining $N?_D$ corresponding to $o$,
$$N?(XE) \Rightarrow ((P(XE) \land (\sim R_1(XE) \land \ldots \land \sim R_l(XE)) \land (\sim O_1(XE) \land \ldots \land \sim O_m(XE))) \Rightarrow N?_D(o(XE))),$$
where $XE$ stands for the variables $xe_1, \ldots, xe_n$; $xe_i$ is a variable of union type $D_i \cup EXV$, and
$N?(XE)$ is an abbreviation for $N?_{D_1}(xe_1) \land \ldots \land N?_{D_n}(xe_n)$.

The above axiom captures the assumption in a specification that if (i) an input to a
constructor $o$ is normal, (ii) the input satisfies the precondition associated with $o$, (iii) none
of the conditions associated with a required exception for $o$ holds for the input, and (iv) the
condition an input must satisfy in case $o$ signals an exception specified to be optional also
does not hold for the input, then $o$ returns a normal value. In other words, this assumption
states that the exceptional behavior of the operations on their intended inputs must be
completely specified by the restrictions component.

The extension of the definition of $N?_{D'}$ for every $D' \in \Delta$ is also captured by a
similar set of axioms corresponding to every observer $o \in \Omega$ of result type $D'$. There is an
axiom having the above structure corresponding to every observer $o$ in $\Omega$.

In addition to the above axioms, we have a rule for every operation and auxiliary
function expressing that if any argument to a function is not normal, then the result of the
function invocation is also not normal.

$$(N?_{D_1}(xe_1) = F \lor \ldots \lor N?_{D_n}(xe_n) = F) \vdash N?_{D'}(o(xe_1, \ldots, xe_n)) = F.$$

Note that there is no axiom so far which states the condition when $N?_{D'}$ is F. In the next
subsection on the equational subtheory, we introduce a rule characterizing such behavior of
$N?_{D'}$.

We use the unrestricted axioms derived from the restrictions and axioms
components of S, and the axioms defining $N?_{D'}$, along with the additional axioms and rules
characterizing the general properties about the exceptional behavior to build various
subsets of Th(S) and finally Th(S) itself.
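
The role of $N?$ and of the strictness rule can be illustrated with a small executable model. The following Python sketch is only an illustration under stated assumptions: stacks are modelled as tuples, exception values as instances of a hypothetical `Exc` class, and an exceptional argument is passed through unchanged as the result (the rule only requires that the result be not normal; which exception value results is a choice made here). The names `Exc`, `is_normal`, and `strict` are not part of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exc:
    """Hypothetical representation of an exception value in EXV."""
    name: str
    args: tuple = ()


def is_normal(v):
    """Plays the role of N?: True for a normal value, False for an exception."""
    return not isinstance(v, Exc)


def strict(op):
    """Wrap an operation so that a non-normal argument yields a non-normal result."""
    def wrapped(*args):
        for a in args:
            if not is_normal(a):
                return a  # assumption: propagate the first exceptional argument
        return op(*args)
    return wrapped


NULL = ()  # the empty stack


@strict
def push(s, i):
    return s + (i,)


@strict
def top(s):
    # Required exception: Top signals no-top() on the empty stack.
    return Exc("no-top") if s == NULL else s[-1]
```

In this model `is_normal(top(NULL))` is false, and so is `is_normal(push(top(NULL), 1))`, mirroring the propagation of F through an enclosing operation.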




4.3.2 Equational Subtheory 


As in the case of specifications without nondeterminism and without exceptional
behavior, we define the equational subtheory EQ(S) as a set of atomic formulas. Besides
equations of the kind discussed in Subsection 4.2.2, we also have the following atomic
formulas:

(a) $e$ signals $ext$, and

(b) $ext_1 = ext_2$.
In addition to the rules characterizing = discussed in Subsection 4.2.2, we use the
substitution rule for $\forall$, and the rules characterizing 'signals' and capturing the observable
equivalence relation on exception values. The substitution rule for $\forall$,

$$(\forall x)\, \varphi(x) \Rightarrow \varphi[x/e],$$
where $x$ is a variable of type $D'$, and $e$ is a term of type $D'$ and is substitutable for $x$ in $\varphi$ [16],
is modified to

$$(\forall x)\, \varphi(x) \Rightarrow (N?_{D'}(e) = T \Rightarrow \varphi[x/e]),$$
since $x$ is a variable ranging over normal values and $e$ can signal an exception.

Rule (i) below says when $N?_{D'}$ is false: if a term of type $D'$ signals an
exception, then $N?_{D'}$ on that term is false. Rule (ii) states that if two terms are observably
equivalent and one signals an exception, then the other also signals the same exception.
Rule (iii) states that if a term signals two exceptions, then the exceptions are observably
equivalent. Rule (iv) states how the observable equivalence relation on exception values is
related to the observable equivalence relations on normal values.

(i) $xe$ signals $exv \vdash N?_{D'}(xe) = F$,

(ii) $xe_1 = xe_2$, $xe_1$ signals $exv \vdash xe_2$ signals $exv$,

(iii) $xe$ signals $exv_1$, $xe$ signals $exv_2 \vdash exv_1 = exv_2$, and

(iv) for every exception name $ex$ of arity $D_1 \times \ldots \times D_n$,
$x_1 = x'_1, \ldots, x_n = x'_n \vdash ex(x_1, \ldots, x_n) = ex(x'_1, \ldots, x'_n)$.
It should be clear that the above rules are sound under the following interpretation: In a
type algebra A, for a ground term $e$ and a ground exception term $ext$, the formula
'$e$ signals $ext$' is interpreted as: The interpretation of $e$ in A is the exception value that is the
interpretation of $ext$ in A. For two ground exception terms $ext_1$ and $ext_2$, the formula
'$ext_1 = ext_2$' is interpreted as: The interpretation of $ext_1$ is observably equivalent to the
interpretation of $ext_2$ in EXV of A.

We now show how to use the above rules along with the nonlogical axioms and
the axioms and rules defining $N?_{D'}$ to prove some properties of data types. Since many
nonlogical axioms and formulas are conditional, having the form

(7) $b \Rightarrow e$ signals $ext$,
where $b$ is a boolean term, we use a trick similar to the one used in Subsection 4.2.2 to deal
with such formulas so that they can be proved in EQ(S). We introduce an auxiliary
function if-then : $Bool \times EXV \times D' \to D' \cup EXV$ having the behavior defined by the
following axioms:
    if-then(T, $ext$, $e$) signals $ext$
    if-then(F, $ext$, $e$) = $e$.
Using the auxiliary function if-then, the formula (7) is equivalent to
    $e$ = if-then($b$, $ext$, $e$),
since for an instantiation of the variables in (7), if $b$ interprets to T, then both formulas are
equivalent to '$e$ signals $ext$,' and if $b$ interprets to F, then both hold trivially. The boolean
term $b$ must not signal.
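
The behavior of if-then can be checked on a small model. The Python sketch below is only illustrative and rests on assumptions of its own: stacks are tuples, the exception value no-top() is a hypothetical `Exc` instance, and "signals" is modelled as "evaluates to that exception value". It checks that the restriction on Top, written as an equation with if-then, holds for a few sample stacks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exc:
    """Hypothetical representation of an exception value."""
    name: str


def if_then(b, ext, e):
    """if-then(T, ext, e) signals ext; if-then(F, ext, e) = e."""
    return ext if b else e


NULL = ()
NO_TOP = Exc("no-top")


def empty(s):
    return s == NULL


def top(s):
    # Model of Top: signals no-top() on the empty stack.
    return NO_TOP if empty(s) else s[-1]


def restriction_holds(s):
    # Top(s) = if-then(Empty(s), no-top(), Top(s))
    return top(s) == if_then(empty(s), NO_TOP, top(s))
```

For the empty stack the right-hand side evaluates to the exception value, which is how step 4 of a proof such as 'Top(Null) signals no-top()' is reflected in the model.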
As an illustration, we prove from the nonlogical axioms of Stk-Int that
'Top(Null) signals no-top()' $\in$ EQ(Stk-Int) in Figure 4.10. Similarly, we can prove
    Top(Pop(Push(Null, i))) signals no-top().
    Replace(Push(Push(Null, i1), i2), i3) = Push(Push(Null, i1), i3).
However,
    $Replace(Push^{101}((Null, 1), \ldots, 101), 0) = Push^{101}((Null, 1), \ldots, 100), 0)$
is not derivable because we cannot derive '$N?_{\text{Stk-Int}}(Push^{101}((Null, 1), \ldots, 101)) = T$' due to the optional
exception specified for Push when its stack argument is of size > 100. But we can prove the
following formula:
    $N?_{\text{Stk-Int}}(Push^{101}((Null, 1), \ldots, 101)) \Rightarrow$
    $Replace(Push^{101}((Null, 1), \ldots, 101), 0) = Push^{101}((Null, 1), \ldots, 100), 0)$.
The formula
    Pop(Null) = Null
is not derivable because of the precondition on Pop.


Figure 4.10. Proof of 'Top(Null) signals no-top()'

1. Top(s) = if-then(Empty(s), no-top(), Top(s))              Restriction on Top
2. Empty(Null) = T                                            Axiom 4
3. if-then(Empty(Null), no-top(), Top(Null)) signals no-top() Axiom of if-then
4. Top(Null) signals no-top()                                 Substitution in 1, and rule (ii) above

It would be interesting to investigate the conditions under which
(i) an axiom of the form '$e$ signals $ext$' can be treated as a rewrite rule '$e \to ext$' and the
Knuth-Bendix algorithm be applicable to such axioms, and
(ii) a conditional formula involving signals can be rewritten as an equation using the
if-then and if-then-else operators so that the Knuth-Bendix algorithm is applicable to
conditional formulas also.

4.3.3 Distinguishability Subtheory 


As in the case of specifications without nondeterminism and without exceptional
behavior, DS(S) is defined to be a set consisting of atomic formulas and the negations of
atomic formulas. DS(S) includes EQ(S) as well as formulas having the following structure:

(a) $e_1 \ne e_2$,

(b) $ext_1 \ne ext_2$, and

(c) $e \mathrel{\not\!\mathit{signals}} ext$,
where '$e \mathrel{\not\!\mathit{signals}} ext$' is an abbreviation for '$\sim (\forall x_1, \ldots, x_n)\,[\,e$ signals $ext\,]$' such that
$x_1, \ldots, x_n$ are all the variables in the formula '$e$ signals $ext$.' Besides the axioms and rules
of inference of DS(S) discussed in Subsection 4.2.3, we have the following additional
axioms and rules expressing properties about the exceptional behavior of data types which
enable us to prove formulas having the above structure.

(v) for every exception name $ex : D_1 \times \ldots \times D_n$,
$(\sim x_1 = x'_1 \lor \ldots \lor \sim x_n = x'_n) \vdash\ \sim ex(x_1, \ldots, x_n) = ex(x'_1, \ldots, x'_n)$,

(vi) for different exception names $ex_1 : D_1 \times \ldots \times D_n$ and $ex_2 : D'_1 \times \ldots \times D'_m$ in L(S),
$\sim ex_1(x_1, \ldots, x_n) = ex_2(x'_1, \ldots, x'_m)$,

(vii) for a union type $D' \cup EXV$,
$N?_{D'}(xe_1) = T,\ N?_{D'}(xe_2) = F \vdash\ \sim xe_1 = xe_2$,
where $xe_1$ and $xe_2$ are of type $D' \cup EXV$, and

(viii) $N?_{D'}(xe) = T \vdash\ \sim (\forall exv)\,[\,xe$ signals $exv\,]$.
Rule (v) and axiom (vi) capture the distinguishability relation on exception values. Rule
(v) is the opposite of rule (iv) given in the previous subsection; it states that two exception
values having the same name are distinguishable if any of the arguments in one value is
distinguishable from the corresponding argument in the other value. Axiom (vi) states that
two exception values are distinguishable if their exception names are different. Rule (vii)
states that two values are distinguishable if $N?_{D'}$ holds for one and does not hold for the
other. Rule (viii) says that if $N?_{D'}$ holds for a term, then it cannot signal an exception. The
above axiom and rules are clearly sound. Note that these rules can be used to derive
formulas having the structure '$\sim xe_1 = xe_2$,' which implies that '$xe_1 \ne xe_2$.'
We can derive from the nonlogical axioms of Stk-Int and the above rules that
(8)  $Top(Null) \ne i$
because 'Top(Null) signals no-top(),' '$N?_{Int}(i) = T$,' and '$N?_{Int}(Top(Null)) = F$' $\in$
DS(Stk-Int). The formula
is immediate from the axiom (vi) above. Using the theorem (8) in DS(Stk-Int), we can
prove by contradiction that
    $Null \ne Push(s, i)$.


4.3.4 Inductive Subtheory


The inductive subtheory of S can be constructed as in Subsection 4.2.4; we
can also use the above axioms and rules characterizing the exceptional behavior. The
induction rule (+) in Subsection 4.2.4 has to be modified: instead of requiring that for every
constructor ground term $e$ of type D, $\varphi[x/e]$ be derivable in the premise, we only need to
consider constructor ground terms for which '$N?_D(e) = T$' is derivable. So, we have:

Modified Induction Rule
Given a formula $\varphi(x)$ with a free variable $x$ of type D,
for every constructor ground term $e$ of type D, $N?_D(e) = T \Rightarrow \varphi[x/e] \ \vdash\ (\forall x)\, \varphi(x)$.




We can use the methods discussed in Subsubsection 4.2.4.3 to establish the infinitely many
premises.
As in Subsubsection 4.2.4.4, if a specification S specifies nontrivial preconditions
on constructors, then the above formula can be simplified to
    for every legal constructor ground term $e$ of type D, $N?_D(e) = T \Rightarrow \varphi[x/e]$
    $\vdash\ (\forall x)\, \varphi(x)$,
because of the assumption about the semantics of a constructor on inputs not satisfying the
associated precondition, discussed in Chapter 3.
For example, for Stk-Int, the induction rule is:
    For every legal constructor ground term $e$ of type Stk-Int,
    $N?_{\text{Stk-Int}}(e) = T \Rightarrow \varphi[s/e] \ \vdash\ (\forall s)\, \varphi(s)$.
The above rule can be simplified using the following theorem in a way similar to Set-Int' in
the previous section:


Thm. 4.9 Every legal constructor ground term $e$ of type Stk-Int such that
'$N?_{\text{Stk-Int}}(e) = T$' $\in$ EQ(Stk-Int) is equivalent by equational reasoning to another legal
constructor ground term $e'$ having only Null and Push, i.e., if '$N?_{\text{Stk-Int}}(e) = T$' $\in$
EQ(Stk-Int), then '$e = e'$' $\in$ EQ(Stk-Int).


Proof  By induction on the number of Pop and Replace in a constructor ground term $e$
using axioms 1 and 3 in Figure 4.7. See the details in Appendix III. ∎


The simplified induction rule is:
(9) For every legal constructor ground term $e$ of type Stk-Int having occurrences of
Null and Push only, $N?_{\text{Stk-Int}}(e) = T \Rightarrow \varphi[s/e] \ \vdash\ (\forall s)\, \varphi(s)$.
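
The reduction behind Theorem 4.9 can be sketched as a small term rewriter. The Python code below is an illustration, not the proof in Appendix III. It assumes two rewrite rules: Replace(s, i) → Push(Pop(s), i), which is the axiom quoted in Subsection 4.3.1.2, and Pop(Push(s, i)) → s, which is assumed here because Figure 4.7 is not reproduced in this section. Terms are hypothetical nested tuples, and the optional exception on Push is ignored, which corresponds to assuming that N? holds for the term.

```python
NULL = ("Null",)


def normalize(t):
    """Rewrite a constructor ground term to one built from Null and Push only.

    Returns None when the term is not legal, i.e. when Pop or Replace is
    applied to an empty stack and the precondition is violated.
    """
    op = t[0]
    if op == "Null":
        return t
    if op == "Push":
        s = normalize(t[1])
        return None if s is None else ("Push", s, t[2])
    if op == "Pop":
        s = normalize(t[1])
        if s is None or s[0] == "Null":
            return None  # precondition ~Empty(s) violated
        return s[1]      # assumed rule: Pop(Push(s, i)) -> s
    if op == "Replace":
        # Replace(s, i) -> Push(Pop(s), i)
        return normalize(("Push", ("Pop", t[1]), t[2]))
    raise ValueError(f"unknown operation: {op}")
```

For instance, `normalize(("Replace", ("Push", ("Push", NULL, 1), 2), 3))` yields the term for Push(Push(Null, 1), 3), while `normalize(("Pop", NULL))` yields `None` because the term is not legal.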


4.3.5 The Full Theory 


The full theory Th(S) is also constructed in a similar way as for data types without
exceptional behavior. For example, we can prove the normal form theorem using the
simplified induction rule (9):

$$s = Null \ \lor\ (\exists s', i')\,[\,s = Push(s', i')\,].$$




The diagram summarizing the relationships among different subtheories for
specifications not specifying exceptional behavior on p. 135 also holds in this case.

For the extended deductive system, the following extension of Theorem 4.2
holds:


Thm. 4.10 (i) For any two ground terms $e_1$ and $e_2$ of the same type, if '$e_1 = e_2$' $\in$ Th(S),
then $e_1$ and $e_2$ are observably equivalent by S, and if '$e_1 \ne e_2$' $\in$ Th(S), then $e_1$ and $e_2$ are
distinguishable by S,

(ii) for a ground term $e$ and a ground exception term $ext$, if '$e$ signals $ext$' $\in$ Th(S), then
the interpretation of $e$ in every model A in F(S) is the interpretation of $ext$ in A,

(iii) for two ground exception terms $ext_1$ and $ext_2$, if '$ext_1 = ext_2$' $\in$ Th(S), then $ext_1$ and
$ext_2$ are observably equivalent by S, and if '$ext_1 \ne ext_2$' $\in$ Th(S), then $ext_1$ and $ext_2$ are
distinguishable by S, and

(iv) for any ground term $e$, if '$N?(e) = T$' $\in$ Th(S), then the interpretation of $e$ in every
model A in F(S) is a normal value, and if '$N?(e) = F$' $\in$ Th(S), then the interpretation of $e$
in A is either an exception value or undefined.


Proof  The theorem follows from the facts that

(a) the nonlogical axioms of Th(S) hold in every model in F(S),

(b) the observable equivalence relation used as the interpretation of = is a congruence,

(c) the exceptional behavior of an operation is completely specified by the restrictions
component of S on inputs satisfying its preconditions, and

(d) the axioms and rules defining $N?$ and characterizing the exceptional behavior hold
in every type algebra. ∎


We demonstrate how the full theory constructed from a specification S can be
used to prove properties of programs using the data types specified by S. Figure 4.11 is
another implementation of the union procedure using Choose in a CLU-like language. In this
implementation, an element of the first set argument to union is successively selected using
the operation Choose, removed from the copy of the first argument, and inserted into the
copy of the second argument until the operation Choose signals no-element, indicating that
the set is empty. The handler for no-element associated with the loop is then invoked. In




Figure 4.11. Procedure Union - II


union = proc(s1, s2 : Set-Int") returns (Set-Int")
    i : Int
    r1 : Set-Int" := s1
    r2 : Set-Int" := s2
    { r1 = s1 ∧ r2 = s2 }
    while true do
        { (Size(r1) = 0 = F ∧ IN(Remove(r1, Choose(r1)), Insert(r2, Choose(r1)), s1, s2))
          ∨ (Size(r1) = 0 = T ∧ R[union(s1, s2)/r2]) }
        i := Set-Int"$Choose(r1)
        { IN(Remove(r1, i), Insert(r2, i), s1, s2) }
        r1 := Set-Int"$Remove(r1, i)
        r2 := Set-Int"$Insert(r2, i)
        { IN(r1, r2, s1, s2) }
    end except when no-element :
        end
    { R[union(s1, s2)/r2] }
    return (r2)
    { R }
end union


IN(r1, r2, s1, s2) = (∀ j) [(Has(s1, j) ∨ Has(s2, j)) = (Has(r1, j) ∨ Has(r2, j)) = T] ∧
    (Size(r1) + Size(r2)) ≤ (Size(s1) + Size(s2)) = T ∧ Size(r2) ≥ 0 = T


I/O Specification for union
T ⇒ R, where R = R1 ∧ R2, and

R1 = (∀ i) [(Has(s1, i) ∨ Has(s2, i)) = Has(union(s1, s2), i) = T]
R2 = Size(union(s1, s2)) ≤ Size(s1) + Size(s2) = T


the code, we have included formulas within '{ }' that express relations among different
variables at that point in the code. The Floyd-Hoare inductive assertion method for
proving properties of programs [17, 36, 55] can be extended to incorporate the exceptional
behavior of programs. A statement in this case can terminate in more than one way: either
normally or by signalling an exception. Corresponding to every possible way of
termination of a statement, we associate an input formula for an output formula.

Figure 4.11 includes the input-output specification of union. We use the
following notation for specifying a procedure F(X): Corresponding to every possible
outcome of F on an input X, there is a formula relating the input to the outcome. Since F
can terminate normally or by signalling an exception, we specify the weakest input
condition for normal termination, as well as for every exception signalled by F,

    $TC_1(X) \Rightarrow F(X)$ signals $ext_1$,
    ...
    $TC_k(X) \Rightarrow F(X)$ signals $ext_k$,
    $TC_n(X) \Rightarrow R(X, r)$,

where $TC_1(X), \ldots, TC_k(X)$, $TC_n(X)$, and $R$ are first order formulas, and $r$ stands for a possible
result returned by F on the input X. '$TC_i(X) \Rightarrow F(X)$ signals $ext_i$' is interpreted as: The
weakest input condition for F to terminate by signalling $ext_i$ is $TC_i(X)$.
'$TC_n(X) \Rightarrow R(X, r)$' is similarly interpreted as: The weakest input condition for F to
terminate normally returning a value $r$ such that $R(X, r)$ holds is $TC_n(X)$. If F is
deterministic, then such an $r$ is unique for every X; otherwise, there can be many $r$'s such
that $R(X, r)$ holds. Instead of using $r$ as denoting a result returned by F on X, we can also
use F(X).

The formula 'IN(r1, r2, s1, s2)' is used as an invariant of the loop in the program
in Figure 4.11. Using the backward substitution semantics of the control structures, we can
generate the verification conditions and show the required formulas to be in Th(Set-Int").
The partial correctness proof of union is complete if we can show that

    IN(r1, r2, s1, s2) ⇒
    ( (Size(r1) = 0 = F ∧ IN(Remove(r1, Choose(r1)), Insert(r2, Choose(r1)), s1, s2))
      ∨ (Size(r1) = 0 = T ∧ R[union(s1, s2)/r2]) ).
To prove the above formula, we need the theorem
    Size(r1) > 0 = T ⇒ Size(Remove(r1, Choose(r1))) + 1 = Size(r1).

The while loop terminates because each time through the loop, Size(r1) is reduced, and
Choose(r1) signals no-element when Size(r1) = 0 = T.
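
The control structure of this procedure, a loop that is left only when Choose signals, can be mirrored in an executable sketch. The Python code below is an illustration under assumptions of its own: sets are `frozenset` values, the exception no-element is a hypothetical `NoElement` class, Choose deterministically returns the minimum element, and the invariant is checked at run time with `assert` rather than proved.

```python
class NoElement(Exception):
    """Hypothetical counterpart of the exception no-element."""


def choose(s):
    if not s:
        raise NoElement()
    return min(s)  # assumption: one fixed deterministic choice


def invariant(r1, r2, s1, s2):
    # Same members overall, total size not increased, Size(r2) >= 0.
    return ((r1 | r2) == (s1 | s2)
            and len(r1) + len(r2) <= len(s1) + len(s2)
            and len(r2) >= 0)


def union(s1, s2):
    s1, s2 = frozenset(s1), frozenset(s2)
    r1, r2 = s1, s2
    try:
        while True:
            i = choose(r1)          # signals NoElement once r1 is empty
            r1 = r1 - {i}
            r2 = r2 | {i}
            assert invariant(r1, r2, s1, s2)
    except NoElement:
        pass                        # handler attached to the loop
    return r2
```

Each iteration removes one element from `r1`, so the loop is left through the handler after `len(s1)` iterations, which matches the termination argument given above.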

An alternate approach to the Floyd-Hoare method of reasoning about programs
is to use the first order semantics of control structures as suggested by Cartwright and
McCarthy [8]. They have shown how reasoning about recursive programs can be
completely carried out in first order logic. The definition of a recursive program can be
considered as an axiom defining the function computed by the program with an
appropriate condition on variables.⁶ The termination of such a program can also be proved
by adding a minimization scheme corresponding to its function. For example, the above
iterative union program can be transformed to an equivalent recursive program, and the
axiom characterizing the function computed by the program is derived from the recursive
program. Th(Set-Int") is enriched by adding this axiom about union and a minimization
scheme corresponding to union. The input-output specification of union can then be
proved as a theorem in the enriched theory. We use a similar approach in the next chapter
in showing the correctness of an implementation.

4.3.6 Properties of a Specification 


It should be clear from the discussion in the previous subsections that the
following extension of Theorem 4.3 holds:


Thm. 4.11 For a consistent S,

(i) for any ground terms $e_1$ and $e_2$ of the same type, both '$e_1 = e_2$' and '$e_1 \ne e_2$' cannot be
in Th(S),

(ii) for any two ground exception terms $ext_1$ and $ext_2$, both '$ext_1 = ext_2$' and '$ext_1 \ne ext_2$'
cannot be in Th(S), and

(iii) for any ground term $e$, both '$N?(e) = T$' and '$N?(e) = F$' cannot be in Th(S). ∎


We extend the definitions of the sufficient completeness, completeness, and well
definedness properties discussed in Subsection 4.2.6 to specifications specifying
exceptional behavior. The results about these properties in Subsection 4.2.6 directly extend
when the modified definitions are used.


6. The condition is that a variable is instantiated to a value of its type other than ⊥, which is used to denote
non-termination.




4.3.6.1 Sufficient Completeness 


Recall that the sufficient completeness property as defined in Subsection 4.2.6
requires that the behavior of the observers on any intended input should be deducible by
equational reasoning. When a specification specifies data types having operations which
signal exceptions, then the observable behavior of the operations also includes their
exceptional behavior. Two values of a data type can also be distinguished in this case if a
sequence of operations signals an exception on one value and does not signal on the other,
or if the sequence of operations signals different exceptions on different values. In the
extended definition of sufficient completeness, we want to capture the intuition that in
addition to the normal behavior of the observers, a sufficiently complete specification must
also completely specify the exceptional behavior of the operations when their inputs satisfy
the associated preconditions.

If a specification has only required exception conditions for the operations, then
the above amounts to requiring that

(i) for any legal ground term $e$, either '$N?(e) = T$' $\in$ EQ(S) or '$N?(e) = F$' $\in$ EQ(S), and
(ii) (a) if '$N?(e) = T$' $\in$ EQ(S) and $e$ is of type $D' \in \Delta$, then the condition stated in Def. 4.6
must be satisfied (i.e., there is a ground term $e'$ not having any operation symbol of D or
auxiliary functions used in S such that '$e = e'$' $\in$ EQ(S)), and
(b) if '$N?(e) = F$' $\in$ EQ(S) and for every subterm $e_i$ of $e$, '$N?(e_i) = T$' $\in$ EQ(S), then
the formula '$e$ signals $ext$' $\in$ EQ(S) for some ground exception term $ext$.

If S specifies optional exceptions also, then there are legal ground terms for which
neither '$N?(e) = T$' nor '$N?(e) = F$' is provable. For example, we can neither prove

    '$N?_{Int}(Top(Push^{101}((Null, 1), \ldots, 101))) = T$'
nor

    '$N?_{Int}(Top(Push^{101}((Null, 1), \ldots, 101))) = F$'
from the specification of Stk-Int. For such a specification, the definition of sufficient
completeness must include the condition that for such a ground term, if we assume
'$N?(e) = T$,' then '$e = e'$' is derivable using equational reasoning. This condition is
based on an aspect of the semantics of a specification, namely that if an operation does not
signal on an input for which it had the option to signal, then the formulas in the axioms
component for the operation behavior must hold.


Def. 4.9 A specification S is sufficiently complete if and only if

(i) for every $e$ of type $D' \in \Delta$, if '$N?(e) = T$' $\in$ EQ(S), then there is a theorem '$e = e'$' $\in$
EQ(S) for some $e'$, a ground term of type $D'$ not having any operation symbol of D and
auxiliary function in S,

(ii) for every $e$ $(= o(e_1, \ldots, e_n))$ of type $D' \in \Delta \cup \{D\}$, if '$N?(e) = F$' $\in$ EQ(S), and
'$(N?(e_1) \land \ldots \land N?(e_n)) = T$' $\in$ EQ(S), then there is a theorem '$e$ signals $ext$' $\in$ EQ(S) for
some ground exception term $ext$, and

(iii) for every legal ground term $e$ of type $D' \in \Delta \cup \{D\}$, if neither '$N?(e) = T$' $\in$ EQ(S)
nor '$N?(e) = F$' $\in$ EQ(S), then there exists a subterm $e_1$ of $e$ such that $e_1 = o(e_{11}, \ldots, e_{1n})$
and '$O[x_1/e_{11}, \ldots, x_n/e_{1n}] = T$' $\in$ EQ(S), where $o$ is specified to optionally signal if its
input satisfies $O(x_1, \ldots, x_n)$, and assuming '$N?(e) = T$,' there is a theorem '$e = e'$' $\in$
EQ(S $\cup$ { $N?(e) = T$ }), where $e'$ is a ground term of type $D'$ having no operation symbol of
D and auxiliary function used in S.


S $\cup$ { $f$ } stands for the nonlogical axioms derived from S plus the formula $f$, and
EQ(S $\cup$ { $f$ }) stands for the equational subtheory derived using S $\cup$ { $f$ } as the nonlogical
axioms. The condition (iii) above amounts to proving the theorem assuming '$N?(e) = T$.'

For example, Stk-Int is sufficiently complete. 'Top(Null) signals no-top()' $\in$
EQ(S). Assuming '$N?_{Int}(Top(Push^{101}((Null, 1), \ldots, 101))) = T$,' we can derive
'$Top(Push^{101}((Null, 1), \ldots, 101)) = 101$' in EQ(S).

The specification of Set-Int" is not sufficiently complete, because, for instance,
though '$N?_{Int}(Choose(Insert(Insert(Null, 0), 1))) = T$' $\in$ EQ(S), there does not exist any
ground term $e'$ of type Int not having any operation symbol of Set-Int" such that
'$Choose(Insert(Insert(Null, 0), 1)) = e'$' $\in$ EQ(S).

The results discussed about specifications not specifying exceptional behavior in
Subsection 4.2.6 directly extend to specifications specifying exceptional behavior when
appropriately modified. We have


Thm. 4.12 If S is sufficiently complete, then S is behaviorally complete.
Proof  See Appendix III. ∎


The obvious analog to Theorem 4.5 also holds; its converse is a conjecture analogous to
Conjecture 4.1. We also have


Thm. 4.13 For a consistent and sufficiently complete S, if any two legal ground terms $e_1$
and $e_2$ of type D are distinguishable by S, then '$e_1 \ne e_2$' $\in$ DS(S).


Proof  See Appendix III. ∎


4.3.6.2 Completeness and Well Definedness


The completeness property of a specification can be defined in this case in the
same way as in Subsection 4.2.6. Def. 4.7 in Subsection 4.2.6 works for this case also.
Theorem 4.7 for this case can be proved in the same way as for specifications without
exceptional behavior. It can be shown that the specification of Stk-Int is complete, whereas
the specification of Set-Int" is not complete.
The well definedness property is also defined in the same way as in the case of
specifications without exceptional behavior. Def. 4.8 in Subsection 4.2.6 is valid. It can be
shown that the specifications of Set-Int" and Stk-Int are well defined.




4.4 Theory of Nondeterminism 


In this section, we discuss specifications specifying nondeterministic operations.
Again, we first discuss specifications without exceptional behavior; later, we incorporate
the exceptional behavior also. For the first part, we modify the specification of Set-Int'
given in Figure 4.1 so that the operation Choose is specified to be nondeterministic. Let
Set-Int'' stand for the modified specification. In the second part, we use the specification
of Set-Int given in Figure 3.1.

We find it convenient to express properties of a data type with nondeterministic
operations as formulas using nondeterministic operation symbols (which is also the reason
to allow a specification to have such formulas in the axioms component), but such a
formula must be interpreted properly. A nondeterministic function symbol does not have
the substitution property with respect to = unless interpreted properly. We discussed this
in the previous chapter; we will repeat the discussion here. For example, the formula
'Choose(s) ∈ s = T' in the specification is to be interpreted as: any integer returned by
Choose on the argument s is in the set s. The formula

    s1 = s2 ⇒ Choose(s1) = Choose(s2)
need not hold if 'Choose(s1) = Choose(s2)' is interpreted as: an integer returned by Choose
on s1 is the same as an integer returned by Choose on s2, because different invocations of
Choose on the same argument may return different integers. However, if we interpret
'Choose(s1) = Choose(s2)' as: for every possible integer returned by Choose on s1, Choose
on s2 can return the same integer, and vice versa, then the formula

    s1 = s2 ⇒ Choose(s1) = Choose(s2)
holds. We adopt the latter interpretation, so that the substitution property continues to
hold.⁷ The adopted interpretation is consistent with the definition of observable
equivalence on ground terms involving nondeterministic operations induced by S, given in
Sections 2.2 and 2.3.


7. As is discussed in the previous chapter, the reason for rejecting the former interpretation is that the
formula '$o(x_1, \ldots, x_n) = o(x_1, \ldots, x_n)$' for a nondeterministic symbol $o$ is almost always false under it.




We cannot however express many interesting properties about a data type
because in a formula involving a nondeterministic operation symbol $o$, different
occurrences of a term $o(e_1, \ldots, e_n)$ may result in different values. We often need to express
properties in which different occurrences of the term $o(e_1, \ldots, e_n)$ stand for the same value.
For example, consider another version of the union procedure given in Figure 4.12, which
is a slight modification of the version given in Figure 4.11. In this case, the while loop has
the condition '~ (#(s) = 0)' instead of 'true' in Figure 4.11. In verifying this version of
union, we must use the properties of i, a result returned by Choose. In such a case, we
introduce an auxiliary function $o\_p : D_1 \times \ldots \times D_n \times D' \to Bool$ corresponding to the
nondeterministic operation $o$, which is the relation describing the behavior of $o$.

(*) $o\_p(x_1, \ldots, x_n, y) = T$ if $o$ can return $y$ as a possible result on $x_1, \ldots, x_n$,
    and $o\_p(x_1, \ldots, x_n, y) = F$ otherwise.
For example, we introduce Choose_p for Choose and use Choose_p to express a property of
i, a result returned by Choose.

Since formulas in the axioms component of S are expressed using
nondeterministic operation symbols, we transform them to equivalent formulas having only
deterministic symbols using the auxiliary functions corresponding to the nondeterministic
symbols. We discuss the transformation procedure TR below. L(S) now also includes the
auxiliary function $o\_p$ corresponding to every nondeterministic operation symbol $o$. The
transformed formulas have a restricted interpretation just as the original formulas in the
axioms component, so we derive unrestricted formulas from the transformed formulas
using the method discussed in Section 4.2 for specifications with deterministic operations.
The precondition specified for a nondeterministic operation $o$ is taken as the precondition
for the corresponding auxiliary function $o\_p$. So in the specification of Set-Int,
'~ #(s) = 0' is the precondition for Choose_p. The unrestricted formulas serve as the
nonlogical axioms of S. To prove a formula $f$ involving nondeterministic operation
symbols, we first transform $f$ using TR, and then prove TR($f$) from the nonlogical axioms
of S.
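
The relation Choose_p can be made concrete over a small finite universe. The Python sketch below is illustrative only and assumes a four-element universe of integers; the names `choose_p_max` and `choose_p_min` are hypothetical. It defines two candidate relations, one in which Choose may return any member of the set and one in which it can return only the minimum, and checks that both satisfy the property "every possible result is in the set, and there is at least one possible result" on nonempty sets.

```python
from itertools import combinations

UNIVERSE = range(4)  # assumption: a small finite universe of integers


def choose_p_max(s, i):
    """Choose may return any member of s (maximal nondeterminism)."""
    return i in s


def choose_p_min(s, i):
    """Choose can return only the least member of s (a deterministic model)."""
    return bool(s) and i == min(s)


def satisfies_axiom(choose_p, s):
    # Every possible result is in s, and at least one result is possible.
    every = all(i in s for i in UNIVERSE if choose_p(s, i))
    some = any(choose_p(s, i) for i in UNIVERSE)
    return every and some


NONEMPTY_SETS = [frozenset(c)
                 for n in range(1, 5)
                 for c in combinations(UNIVERSE, n)]
```

Both relations pass the check on every nonempty set, which illustrates why a model need not exhibit the maximum amount of nondeterminism; on the empty set the check fails, which is where the precondition '~ #(s) = 0' is needed.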


The transformation procedure TR must embed the semantics of S assumed in
Chapter 3. Recall that the semantics of S only requires that for every data type in D(S),
an operation specified to be nondeterministic must return an appropriate
value on every input; the operation in every data type in D(S) need not have the maximum
amount of nondeterminism specified by S.

4.4.1 Transformation Procedure TR 


We first describe the procedure TR and later verify that TR($f$) is semantically
equivalent to $f$. Before describing the transformation procedure, we illustrate it using
examples. Consider the following formula in the axioms component of Set-Int'':

    Choose(s) ∈ s = T


Figure 4.12. Procedure Union - III


union = proc(s1, s2 : Set-Int'') returns (Set-Int'')
    i : Int
    r1 : Set-Int'' := s1
    r2 : Set-Int'' := s2
    { r1 = s1 ∧ r2 = s2 }
    while ~ Set-Int''$Size(r1) = 0 do
        { Choose_p(r1, i) = T ∧ IN(Remove(r1, i), Insert(r2, i), s1, s2) }
        i := Set-Int''$Choose(r1)
        { IN(Remove(r1, i), Insert(r2, i), s1, s2) }
        r1 := Set-Int''$Remove(r1, i)
        r2 := Set-Int''$Insert(r2, i)
        { IN(r1, r2, s1, s2) }
    end
    { R[union(s1, s2)/r2] }
    return (r2)
    { R }
end union


IN(r1, r2, s1, s2) = (∀ j) [(Has(s1, j) ∨ Has(s2, j)) = (Has(r1, j) ∨ Has(r2, j)) = T] ∧
    (Size(r1) + Size(r2)) ≤ (Size(s1) + Size(s2)) = T ∧ Size(r2) ≥ 0 = T


I/O Specification for union

T ⇒ R, where R = R1 ∧ R2, and
R1 = (∀ i) [(Has(s1, i) ∨ Has(s2, i)) = Has(union(s1, s2), i) = T]
R2 = Size(union(s1, s2)) ≤ Size(s1) + Size(s2) = T




The above formula states that every value returned by Choose is in the set s. The
transformed formula obtained after applying the procedure would be

    ((∀ i) [Choose_p(s, i) = T ⇒ i ∈ s = T] ∧ (∃ i) Choose_p(s, i) = T).
The second conjunct states that Choose returns at least one value on every input. The
unrestricted formula, which serves as a nonlogical axiom of Set-Int'', is obtained using the
precondition for Choose; it is given below:

    ((∀ i) [~ #(s) = 0 = T ⇒ (Choose_p(s, i) = T ⇒ i ∈ s = T)] ∧
     (∃ i) [~ #(s) = 0 = T ⇒ Choose_p(s, i) = T]).

Let us consider another formula 'Choose(s1) = Choose(s2).' This states that for every
value returned by Choose on s1, there is an observably equivalent value returned by
Choose on s2, and vice versa. TR transforms this formula to

    ((∀ i1) [Choose_p(s1, i1) = T ⇒ (∃ i2) [Choose_p(s2, i2) = T ∧ i1 = i2]] ∧
     (∀ i2) [Choose_p(s2, i2) = T ⇒ (∃ i1) [Choose_p(s1, i1) = T ∧ i1 = i2]]).
We now present the transformation procedure TR, which is defined inductively 

making use of the structure of a formula, 

Basis f is an atomic formula e₁ = e₂.

(a) f does not have any occurrence of a nondeterministic operation symbol:

TR(f) = f

(b) both e₁ and e₂ have occurrences of nondeterministic operation symbols:

We wish TR(f) to roughly express that for every instance of the free variables in f, for
every possible choice made about the invocations of the nondeterministic operation
symbols in e₁, there are choices for the invocations of the nondeterministic operation
symbols in e₂ such that the instantiations of e₁ and e₂ return equivalent results, and vice
versa.

TR(‘e₁ = e₂’) has the following structure:

(∀ z₁, ..., zₖ) [ c₁ ⇒ (∃ y₁, ..., yₘ) [ c₂ ∧ e₁' = e₂' ] ] ∧
(∀ y₁, ..., yₘ) [ c₂ ⇒ (∃ z₁, ..., zₖ) [ c₁ ∧ e₁' = e₂' ] ],

where z₁, ..., zₖ are new variables such that corresponding to each occurrence of a
nondeterministic operation symbol o in e₁, say the occurrence o(ē₁, ..., ēₙ), there is a
variable zᵢ to stand for the possible result returned by o on its input. The formula c₁ is a
conjunction of the equations of the form ‘o_p(ē₁', ..., ēₙ', zᵢ) = T’, stating conditions on zᵢ.
Similarly for e₂, new variables y₁, ..., yₘ are introduced, and c₂ is obtained from e₂. e₁' and
e₂' are obtained from e₁ and e₂ respectively, by substituting z₁, ..., zₖ and y₁, ..., yₘ for
subterms having nondeterministic operations as the outermost operation in e₁ and e₂
respectively. We discuss later how c₁ and e₁' are constructed from e₁, and c₂ and e₂' are
obtained from e₂.

(c) only one side of the equation ‘e₁ = e₂’ has occurrences of nondeterministic operation
symbols. Without any loss of generality, we assume that only the l.h.s. has occurrences of
nondeterministic symbols.

Construct c₁ and e₁' from e₁ as discussed above. Then,

TR(‘e₁ = e₂’) = (∀ z₁, ..., zₖ) [ c₁ ⇒ e₁' = e₂ ] ∧ (∃ z₁, ..., zₖ) c₁

This completes the basis step of the definition of TR. The second conjunct is to ensure that
there is at least one value returned by e₁.


Inductive Step
Since all other logical symbols can be expressed in terms of ~, ∧ and ∀, we define
how TR works on formulas having these symbols.
(a) if f is ~ f₁, then TR(f) = ~ TR(f₁)
(b) if f is f₁ ∧ f₂, then TR(f) = TR(f₁) ∧ TR(f₂)
(c) if f is (∀ x) f₁, then TR(f) = (∀ x) TR(f₁).
This completes the definition of TR.
For instance, a conditional equation ‘b ⇒ e₁ = e₂’, where b is a boolean term, is
transformed to
b ⇒ TR(‘e₁ = e₂’),
if b does not have any nondeterministic operation symbols. If b has nondeterministic
symbols, then the conditional equation is transformed to
TR(‘b = T’) ⇒ TR(‘e₁ = e₂’)
= ((∀ z₁, ..., zₖ) [ c ⇒ b' = T ] ∧ (∃ z₁, ..., zₖ) c) ⇒ TR(‘e₁ = e₂’).
Since such a b is assumed to behave deterministically (see Section 3.1), i.e., for an
instantiation of the free variables X in the conditional equation, b interprets either to T or
to F, the above formula agrees with the interpretation of a conditional equation assumed in
Section 3.2 on the semantics of a specification.
We now describe how to construct c and e' from a term e by induction on the
number of occurrences of nondeterministic operation symbols in e. Let k stand for the
number of occurrences of nondeterministic operation symbols in e.

Basis k = 1

Let e¹ = o(ē₁, ..., ēₙ) be the subterm of e having the nondeterministic operation o as
its outermost operation symbol. Then c is ‘o_p(ē₁, ..., ēₙ, z₁) = T’ and e' is obtained by
replacing e¹ in e by z₁. The type of z₁ is the range type of o.

Inductive Step Assume c and e' can be constructed if e has k' < k occurrences of
nondeterministic symbols. Show for k.

(i) If e has a subterm having k occurrences of nondeterministic operation symbols,
let the subterm be e¹ = o(ē₁, ..., ēₙ), where o is a nondeterministic operation symbol.
Each ēᵢ has less than k occurrences of nondeterministic operation symbols. By the
inductive hypothesis, let c₁, ..., cₙ be the formulas obtained by applying this procedure on
ē₁, ..., ēₙ respectively, and let ē₁', ..., ēₙ' be the terms obtained by replacing subterms
having nondeterministic operation symbols by new variables in ē₁, ..., ēₙ respectively.
Then

c = o_p(ē₁', ..., ēₙ', z) = T ∧ c₁ ∧ ... ∧ cₙ

and e' is obtained by replacing e¹ in e by z.

(ii) There is no such subterm of e. Consider all outermost subterms of e having a
nondeterministic operation symbol as their outermost operation; let them be ē₁, ..., ēₘ.
Each of these subterms has less than k occurrences of nondeterministic
operation symbols. By the inductive hypothesis, let c₁, ..., cₘ be the formulas obtained by
transforming ē₁, ..., ēₘ respectively, and let ē₁', ..., ēₘ' be the terms obtained by replacing
subterms having nondeterministic operation symbols by new variables in ē₁, ..., ēₘ
respectively. Then

c = c₁ ∧ ... ∧ cₘ

and e' is obtained by replacing ē₁, ..., ēₘ by ē₁', ..., ēₘ' respectively.

This completes the discussion about how c and e' are obtained from e.
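
As an illustration only, the construction of c and e' and the basis step of TR can be
sketched in Python. The sketch is not part of the original development: terms are encoded
as nested tuples, the set NONDET and the function names split and tr_equation are
assumptions made here, and the fresh variables z1, z2, ... and y1, y2, ... are assumed not
to occur already in the formula.

```python
from itertools import count

NONDET = {"Choose"}  # assumption: the nondeterministic operation symbols


def split(term, prefix, fresh):
    """Return (c, e_prime) for a term, following the inductive construction."""
    if isinstance(term, str):            # a variable or a constant
        return [], term
    op, *args = term
    conds, new_args = [], []
    for arg in args:                     # transform the arguments first
        c, a = split(arg, prefix, fresh)
        conds.extend(c)
        new_args.append(a)
    if op in NONDET:
        z = f"{prefix}{next(fresh)}"     # new variable standing for the result
        # c = o_p(e1', ..., en', z) = T, followed by the conditions of the arguments
        return [(op + "_p", *new_args, z)] + conds, z
    return conds, (op, *new_args)


def show(term):
    if isinstance(term, str):
        return term
    op, *args = term
    return f"{op}({', '.join(show(a) for a in args)})"


def conj(conds):
    return " ∧ ".join(f"{show(c)} = T" for c in conds)


def variables(conds):
    return ", ".join(c[-1] for c in conds)


def tr_equation(lhs, rhs):
    """TR for an atomic formula lhs = rhs, cases (a), (b) and (c) of the basis."""
    c1, e1 = split(lhs, "z", count(1))
    c2, e2 = split(rhs, "y", count(1))
    eq = f"{show(e1)} = {show(e2)}"
    if not c1 and not c2:                # case (a): no nondeterministic symbol
        return eq
    if c1 and c2:                        # case (b): both sides
        return (f"(∀ {variables(c1)}) [{conj(c1)} ⇒ (∃ {variables(c2)}) [{conj(c2)} ∧ {eq}]] ∧ "
                f"(∀ {variables(c2)}) [{conj(c2)} ⇒ (∃ {variables(c1)}) [{conj(c1)} ∧ {eq}]]")
    c = c1 or c2                         # case (c): one side only
    return f"(∀ {variables(c)}) [{conj(c)} ⇒ {eq}] ∧ (∃ {variables(c)}) [{conj(c)}]"
```

Applied to the encoding ("Has", "s", ("Choose", "s")) of ‘Choose(s) ∈ s’ and the constant
"T", the sketch yields a formula of the same shape as the transformed axiom for Choose
discussed in this subsection.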




Thm. 4.14 f and TR(f) are semantically equivalent.


Proof See Appendix III. ∎


4.4.2 Th(S) 


The nonlogical axioms obtained as discussed above are used to prove properties 
about the data type. A nonlogical axiom involves existential quantifiers in contrast to a 
nonlogical axiom of a specification specifying only deterministic operations. So, the whole 
machinery of first order predicate calculus is needed to prove an arbitrary equation or an 
inequality involving nondeterministic symbols. So it is not meaningful to discuss the 
subtheories EQ(S), DS(S), and IND(S); we instead discuss the full theory Th(S). The 
formulas are proved in the same way as in the case of specifications specifying deterministic
operations only.

As an illustration of the use of Th(S), we verify the version of the procedure union
given in Figure 4.12. Note that the backward substitution semantics of the assignment
statement
i := Set-Int$Choose(r1)
is given as
{ Choose_p(r1, i') = T ∧ P[i'/i] } i := Set-Int$Choose(r1) { P },
instead of
{ P[Choose(r1)/i] } i := Set-Int$Choose(r1) { P },
where P[e/i] denotes P with e substituted for i, because different occurrences of the
expression Choose(r1) can possibly return different results. For example, the verification
condition
{ IN(Remove(r1, Choose(r1)), Insert(r2, Choose(r1)), s1, s2) }
i := Set-Int$Choose(r1) { IN(Remove(r1, i), Insert(r2, i), s1, s2) }
is not true, whereas
{ Choose_p(r1, i') = T ∧ IN(Remove(r1, i'), Insert(r2, i'), s1, s2) }
i := Set-Int$Choose(r1) { IN(Remove(r1, i), Insert(r2, i), s1, s2) }
is true. In this case also, ‘IN(r1, r2, s1, s2)’ serves as an invariant of the loop. Using the
backward substitution semantics of the control structures, we can generate the verification
conditions and show the required formulas to be in Th(Set-Int''). The partial correctness
proof of union is complete if we can show that

( ~ Size(r1) = 0 = T ∧ IN(r1, r2, s1, s2)) ⇒
(Choose_p(r1, i) = T ∧ IN(Remove(r1, i), Insert(r2, i), s1, s2))

To prove the above formula, we need the theorem

Size(r1) > 0 = T ⇒ Size(Remove(r1, Choose(r1))) + 1 = Size(r1).

The termination is also ensured because each time in the loop, Size(r1) is reduced, so the
loop condition will eventually become false.
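
The role of the invariant can be checked on small inputs with the following Python sketch,
which is an illustration and not part of the verification itself. Sets are modelled as
frozensets, every i with Choose_p(r1, i) = T is taken to be any member of r1, and the
function names are assumptions of the sketch. The function all_runs explores every
sequence of choices the loop of Figure 4.12 could make, while double_choice_can_break
evaluates Choose(r1) twice in one step, as in the rejected verification condition.

```python
def invariant(r1, r2, s1, s2):
    """IN(r1, r2, s1, s2), with Has read as membership and Size as len."""
    same_members = (s1 | s2) == (r1 | r2)
    return same_members and len(r1) + len(r2) <= len(s1) + len(s2) and len(r2) >= 0


def all_runs(s1, s2):
    """Results of union over every sequence of choices Choose could make."""
    results = set()
    stack = [(frozenset(s1), frozenset(s2))]
    while stack:
        r1, r2 = stack.pop()
        assert invariant(r1, r2, s1, s2)   # the loop invariant holds in every state
        if not r1:                         # loop exit: Size(r1) = 0
            results.add(r2)
            continue
        for i in r1:                       # every i with Choose_p(r1, i) = T
            # i is bound once and used in both Remove and Insert
            stack.append((r1 - {i}, r2 | {i}))
    return results


def double_choice_can_break(r1, r2):
    """Can one step break the invariant if Choose(r1) is evaluated twice?

    The state before the step is taken as the reference pair (s1, s2).
    """
    return any(not invariant(r1 - {a}, r2 | {b}, r1, r2)
               for a in r1 for b in r1)
```

For r1 = {1, 2} and an empty r2, removing 1 while inserting 2 loses the element 1, which
is why the two occurrences of Choose(r1) cannot be treated as the same value.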

We think that many properties of nondeterministic operations expressed as
equations and inequalities can be derived from the untransformed nonlogical axioms (the
nonlogical axioms obtained from the formulas in the Axioms component of the
specification before applying TR) using techniques employed for deterministic operations,
for instance, viewing equations as rewrite rules and using the Knuth-Bendix algorithm for
deriving properties. We have not investigated the extent to which this can be done. This
hypothesis is another reason for preferring to write specifications directly using
nondeterministic operation symbols as compared to writing them indirectly using the
relations corresponding to nondeterministic operations.

4.4.3 Data Types with Exceptional Behavior 


We discuss the modifications required to incorporate the exceptional behavior
specified by the specifications with nondeterministic operations. We describe how to
derive the nonlogical axioms from a specification. We use the original specification of
Set-Int given in Figure 3.1 for illustration; the specification is repeated in Figure 4.13.

As before, an auxiliary function o_p is associated with every nondeterministic
operation symbol o; o_p is not strict with respect to its last argument:




Figure 4.13. Specification of Set-Int

Operations

    Null : → Set-Int                         as ∅
    Insert : Set-Int × Int → Set-Int
    Remove : Set-Int × Int → Set-Int
    Has : Set-Int × Int → Bool               as x₂ ∈ x₁
    Size : Set-Int → Int                     as #(x₁)
    Choose : Set-Int → Int nondeterministic
                     → no-element()

Restrictions

    #(s) = 0 ⇒ Choose(s) signals no-element

Axioms

    Remove(∅, i) = ∅
    Remove(Insert(s, i1), i2) = if i1 = i2 then Remove(s, i1) else Insert(Remove(s, i2), i1)
    i ∈ ∅ = F
    i1 ∈ Insert(s, i2) = if i1 = i2 then T else i1 ∈ s
    #(∅) = 0
    #(Insert(s, i)) = if i ∈ s then #(s) else #(s) + 1
    Choose(s) ∈ s = T


o : D₁ × ... × Dₙ → D' ∪ EXV
o_p : D₁ × ... × Dₙ × (D' ∪ EXV) → Bool,


o_p(x₁, ..., xₙ, ze) = T   if N?_D'(ze) = T and o can return ze
                           as a possible result on x₁, ..., xₙ
                       T   if N?_D'(ze) = F and o signals ze on x₁, ..., xₙ
Recall that ze is of union type.


We extend the transformation procedure TR discussed in the previous subsection.
Besides equations, we have two additional kinds of atomic formulas: ‘e signals ext’ and
‘ext₁ = ext₂’. TR for equations is the same as in the previous subsection except that the new
variables introduced in the transformation are of union type.

An exception name is treated like a deterministic operation symbol, so
‘ext₁ = ext₂’ is treated like an equation ‘e₁ = e₂’. TR is extended to treat ‘e signals ext’ as
‘e = ext’. TR is applied on ‘e = ext’. In the transformed formula, a subformula of the form
‘e' = ext’, where ext is an exception term and e' is a non-variable term, is replaced by
the subformula ‘e' signals ext’. Note that a transformed formula may involve terms
constructed using variables ranging over union types.
The restrictions on a nondeterministic operation o are transformed to get the
nonlogical axioms as follows: A restriction specifying a required exception for o,
R(X) ⇒ o(X) signals ext,
is transformed to
P_o(X) ⇒ ( R(X) ⇒ o_p(X, ext) = T ).
For example, from the restriction on Choose,
#(s) = 0 ⇒ Choose(s) signals no-element(),
we get
#(s) = 0 ⇒ Choose_p(s, no-element()) = T.
A restriction specifying an optional exception for o,
o(X) signals ext ⇒ O(X),
is transformed to
P_o(X) ⇒ ( o_p(X, ext) = T ⇒ O(X) = T ).
Axioms defining N?_D' are constructed the same way as for the specification with
deterministic operations except that there is no axiom due to a nondeterministic operation
o because the range of the corresponding auxiliary function o_p is Bool and not
Bool ∪ EXV. In addition to the axioms and rules expressing general properties of the
exceptional behavior of the operations discussed in the previous sections, we have another
rule. Recall that a nondeterministic operation can either signal an exception or has the
choice to return one of many possible normal values. An operation does not have the
choice between returning a normal value and signalling an exception on the same input.
This property is captured by the following axiom for every nondeterministic operation o:
~ ( (∃ ze) [ o_p(X, ze) = T ∧ N?_D'(ze) = T ] ∧ (∃ ze) [ o_p(X, ze) = T ∧ N?_D'(ze) = F ] ).
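
A minimal sketch of one possible model of Choose_p over the union type may make these
axioms concrete. It is an illustration under stated assumptions, not a definition taken from
the specification: the exception value is encoded as the string "no-element()", Int is cut
down to the carrier {0, 1, 2} so that the axioms can be checked exhaustively, and Choose
is given the maximum amount of nondeterminism.

```python
from itertools import combinations

NO_ELEMENT = "no-element()"      # assumed encoding of the exception value
UNIVERSE = (0, 1, 2)             # small carrier for Int, for exhaustive checking


def is_normal(ze):
    """N?(ze): True for a normal value, False for an exception value."""
    return ze != NO_ELEMENT


def choose_p(s, ze):
    """Choose_p(s, ze), where ze ranges over the union of Int and EXV."""
    if len(s) == 0:
        return ze == NO_ELEMENT          # required exception on the empty set
    return is_normal(ze) and ze in s     # any element is a possible result


def all_sets():
    for n in range(len(UNIVERSE) + 1):
        for combo in combinations(UNIVERSE, n):
            yield frozenset(combo)


def check_axioms():
    results = UNIVERSE + (NO_ELEMENT,)
    for s in all_sets():
        # #(s) = 0 => Choose_p(s, no-element()) = T
        if len(s) == 0 and not choose_p(s, NO_ELEMENT):
            return False
        normal = any(choose_p(s, ze) and is_normal(ze) for ze in results)
        signalled = any(choose_p(s, ze) and not is_normal(ze) for ze in results)
        # never both a normal result and an exception on the same input
        if normal and signalled:
            return False
        # Choose_p(s, i) = T => i in s, for normal i
        if any(choose_p(s, i) and i not in s for i in UNIVERSE):
            return False
    return True
```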
From the formulas in the axioms component of S, the nonlogical axioms are
derived as follows: We apply TR on a restricted formula to replace nondeterministic
operation symbols by the corresponding auxiliary functions. Since the restricted formula
expresses the normal behavior of the operations, the new variables introduced in the
transformation range only over normal values. So, we use variables of a single type instead of
the union type. For instance, for an equation ‘e₁ = e₂’ having nondeterministic operations
on both sides, we get
(∀ z₁, ..., zₖ) [ c₁ ⇒ (∃ y₁, ..., yₘ) [ c₂ ∧ e₁' = e₂' ] ] ∧
(∀ y₁, ..., yₘ) [ c₂ ⇒ (∃ z₁, ..., zₖ) [ c₁ ∧ e₁' = e₂' ] ].
To get the corresponding unrestricted formula incorporating the exceptional behavior of
the operations and the preconditions, we must require that
(i) ‘N?_D'(e₁') = T’ and ‘N?_D'(e₂') = T’ hold, and
(ii) every operation invocation in the formula must satisfy the associated precondition.
The unrestricted formula for the above restricted formula is
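(∀ z₁, ..., zₖ) [ N?_D'(e₁') = T ⇒ ( PC_c₁ ⇒ ( c₁ ⇒
    (∃ y₁, ..., yₘ) [ N?_D'(e₂') = T ⇒ ( (PC_e₁' ∧ PC_e₂' ∧ PC_c₂) ⇒ ( c₂ ∧ e₁' = e₂' ) ) ] ) ) ] ∧
(∀ y₁, ..., yₘ) [ N?_D'(e₂') = T ⇒ ( PC_c₂ ⇒ ( c₂ ⇒
    (∃ z₁, ..., zₖ) [ N?_D'(e₁') = T ⇒ ( (PC_e₁' ∧ PC_e₂' ∧ PC_c₁) ⇒ ( c₁ ∧ e₁' = e₂' ) ) ] ) ) ]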
A similar transformation can be obtained for a restricted formula of the form
‘e₁ = if b then e₂.’
For example, the formula
Choose(s) ∈ s = T
in the specification of Set-Int is transformed first to the restricted formula using TR,
((∀ i) [ Choose_p(s, i) = T ⇒ i ∈ s = T ] ∧ (∃ i) [ Choose_p(s, i) = T ]),
and later to
((∀ i) [ N?_Bool(i ∈ s) = T ⇒ (Choose_p(s, i) = T ⇒ (N?_Bool(T) = T ⇒ i ∈ s = T)) ]
 ∧ (∃ i) [ Choose_p(s, i) = T ]),
which gets simplified to
((∀ i) [ Choose_p(s, i) = T ⇒ i ∈ s = T ] ∧ (∃ i) [ Choose_p(s, i) = T ]),
because ‘N?_Bool(i ∈ s) = T’ and ‘N?_Bool(T) = T’ are derivable.

Figure 4.14 is yet another implementation of union using the nondeterministic
operation Choose which signals on the empty set. This version is similar to the version
given in Figure 4.11 except that Choose is nondeterministic. It can also be verified using
the properties in Th(Set-Int).


Figure 4.14. Procedure Union - IV


union = proc(s1, s2 : Set-Int) returns (Set-Int)
    i : Int
    r1 : Set-Int := s1
    r2 : Set-Int := s2
    { r1 = s1 ∧ r2 = s2 }
    while true do
        { (Size(r1) = 0 = F ∧ Choose_p(r1, i') = T ∧ IN(Remove(r1, i'), Insert(r2, i'), s1, s2))
          ∨ (Size(r1) = 0 = T ∧ R[r2/union(s1, s2)]) }
        i := Set-Int$Choose(r1)
        { IN(Remove(r1, i), Insert(r2, i), s1, s2) }
        r1 := Set-Int$Remove(r1, i)
        r2 := Set-Int$Insert(r2, i)
        { IN(r1, r2, s1, s2) }
        end except when no-element :
            end
    { R[r2/union(s1, s2)] }
    return (r2)
    { R }
end union

Here R[r2/union(s1, s2)] denotes R with r2 substituted for union(s1, s2).


IN(r1, r2, s1, s2) = (∀ j) [ (Has(s1, j) ∨ Has(s2, j)) = (Has(r1, j) ∨ Has(r2, j)) = T ] ∧
    (Size(r1) + Size(r2)) ≤ (Size(s1) + Size(s2)) = T ∧ Size(r2) ≥ 0 = T


I/O Specification for union

T ⇒ R, where R = R1 ∧ R2, and
R1 = (∀ i) [ (Has(s1, i) ∨ Has(s2, i)) = Has(union(s1, s2), i) = T ]
R2 = Size(union(s1, s2)) ≤ Size(s1) + Size(s2) = T




4.4.4 Properties of a Specification 


We can prove theorems analogous to Theorems 4.10 and 4.11 for specifications
specifying nondeterministic operations and exceptional behavior, demonstrating the
soundness of the axioms capturing general properties of data types.

The definition of the sufficient completeness property has to be modified
significantly, because there is no meaningful definition of the equational subtheory for
such specifications. Because of the semantics of S as defined in Section 3.2, it does not help
to consider only the formulas involving deterministic operations and the auxiliary functions
corresponding to nondeterministic operation symbols. Recall that for a behaviorally
complete specification, for every input X to a nondeterministic operation o, the
corresponding auxiliary function is required to hold for at least one (X, ze), where ze is a
possible result returned by o on X, and the axioms do not precisely specify the values on
which the auxiliary function holds. This incompleteness is because the semantics of S does
not constrain an operation specified to be nondeterministic to have any fixed amount of
nondeterminism (see Section 3.2).

A plausible modification to the definition of sufficient completeness is to require
it to use the whole machinery of first order predicate calculus for deduction. Instead of
requiring a theorem to be in EQ(S), we require it to be in Th(S). In addition, the definition
of sufficient completeness given in Subsection 4.3.6 must also be modified to deal with the
case when a legal ground term e involves nondeterministic operation symbols. For e of
type D' ∈ Δ, if ‘N?_D'(e) = T’ ∈ Th(S), it cannot usually be proved equivalent to a ground
term of type D' having no operation symbol of D, as in the case of
Choose(Insert(Insert(Null, 1), 2)) for example. Instead we must prove that there exists a set
of ground terms { e₁, ..., eₘ } of type D' not having any operation symbol of D such that

(∃ z₁, ..., zₖ) [ c ∧ (e' = e₁ ∨ e' = e₂ ∨ ... ∨ e' = eₘ) ],

where c is the condition on z₁, ..., zₖ generated due to e when we apply the procedure TR,
and e' is the term obtained from e by substituting z₁, ..., zₖ for the subterms having
nondeterministic operation symbols as their outermost operation. { e₁, ..., eₘ } consists of
all possible outcomes of e. (Since it is assumed that ‘N?_D'(e) = T’ ∈ Th(S), z₁, ..., zₖ are of
a single type instead of a union type.) For example, in the case of
Choose(Insert(Insert(Null, 1), 2)), we can show that

(∃ i) [ Choose_p(Insert(Insert(Null, 1), 2), i) = T ∧ (i = 1 ∨ i = 2) ]
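
The set of possible outcomes of such a ground term can be enumerated mechanically in a
small model. The following Python sketch is illustrative only: ground terms are encoded as
nested tuples, Choose is assumed to have the maximum amount of nondeterminism, and the
exceptional behavior of Choose on the empty set is deliberately left out (the empty set
simply yields no outcome).

```python
def evaluate(term):
    """Return the set of possible outcomes of a ground term (nested tuples)."""
    if not isinstance(term, tuple):
        return {term}                       # an integer literal
    op, *args = term
    if op == "Null":
        return {frozenset()}
    if op == "Insert":
        return {s | {i} for s in evaluate(args[0]) for i in evaluate(args[1])}
    if op == "Choose":                      # every i with Choose_p(s, i) = T
        return {i for s in evaluate(args[0]) for i in s}
    raise ValueError(f"unknown operation {op}")


# Choose(Insert(Insert(Null, 1), 2))
TERM = ("Choose", ("Insert", ("Insert", ("Null",), 1), 2))
```

Here evaluate(TERM) plays the role of the set { e₁, ..., eₘ } of ground terms with no
operation symbol of Set-Int.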

We have not investigated the relationship between the above definition of
sufficient completeness and the behavioral completeness property for such specifications.
We conjecture that most of the results (Theorems 4.12 and 4.13 in particular) of
Subsection 4.3.6, when appropriately modified, would hold for such specifications also.

The definition of well definedness given in Subsection 4.2.6 directly extends to
this case also. The definition of completeness, like the definition of sufficient
completeness, must require in this case that for any two legal ground terms e₁ and e₂ of the
same type, ‘e₁ = e₂’ ∈ Th(S) if and only if e₁ and e₂ are observably equivalent. The
definition 4.8 of well definedness given in Subsection 4.2.6 is valid in this case also.




4.5 Strong Equivalence of Specifications 


In Subsection 3.2.6, we defined the equivalence on specifications; the definition
required two equivalent specifications to have the same semantics. As discussed in
Subsection 4.2.6, two equivalent specifications can be different in what properties of a data
type (a set of data types) can be deduced from them. Below, we define a stronger
equivalence relation on specifications, which not only requires that the two specifications
have the same semantics, but also that the same properties can be deduced from the
specifications.

Def. 4.10 Two specifications S₁ and S₂ are strongly equivalent if and only if, assuming that
for every type used in S₁ and S₂ we use the same theory,

(i) S₁ and S₂ are equivalent, i.e., D(S₁) = D(S₂), and
(ii) Th(S₁) | L(D) = Th(S₂) | L(D). ∎


If S₁ (or S₂) specifies a nondeterministic operation o, we assume that L(D) includes the
corresponding auxiliary function o_p in place of o.




5. Correctness of Implementation 


One of the main purposes of designing a specification of a data type is to have a
standard that can be used to verify whether an alleged implementation of the data type is
correct. In this chapter, we propose a correctness criterion for an implementation of a data
type with respect to its specification, and discuss a method embodying the proposed
correctness criterion. In this process, we also exhibit how the theory of a data type
discussed in the previous chapter is used.

An implementation of a data type D is concerned with how to realize the behavior 
of D, in contrast to its specification where the main concern is to precisely state its behavior. 
Intuitively speaking, our correctness criterion is that a correct implementation with respect 
to a specification must have the same observable behavior as prescribed by the 
specification. 

Our approach for proving correctness of an implementation is similar to that of 
Hoare [37], Zilles [76] and Guttag et al. [29], and is radically different from the ADJ group’s 
approach [23]. We separate the correctness method from the semantics of the host 
programming language in which an implementation is coded. We do not wish to concern 
ourselves with the issue of semantics of the control structures in the programming 
language, so we assume that the semantics of the procedures implementing the operations 
of D is already derived from their code. In contrast, the ADJ group does not seem to
separate the correctness method from the semantics of the host programming language. It
seems to be incorporating the semantics of the control structures used in implementing the 
operations into the correctness method, for instance, see their definition of deriver, which is 
a morphism from the specification algebra to the implementation algebra [23]. This makes 
its approach complex and restrictive. 

An implementation uses data types abstractly; it does not refer to any particular
implementation of a data type used in it. A recursive implementation of a data type D is an
exception because a reference to D in the recursive implementation is interpreted as the
reference to the implementation itself. We discuss recursive implementations later in the
chapter; until then, we assume that an implementation of a data type does not use the data
type itself. For the time being, we also rule out mutually recursive implementations of a
collection of (recursive or non-recursive) data types in which an implementation I of a data
type D uses a data type D' and an implementation I' of D' uses D. We discuss mutually
recursive implementations later with recursive implementations.

While deriving the semantics of the procedures implementing the operations of D
in an implementation I, we do not use the semantics of any particular implementation of a
data type D' used in I. We instead use the theory constructed from the specification S' of
D', abstracting from all correct implementations of D' with respect to S'. The proof of
correctness of an implementation of D thus does not depend on any property specific to a
particular implementation of D'. It remains valid even when an implementation of D' is
modified or replaced, as long as the new implementation of D' is correct with respect to the
specification of D'. This separation of the proof of use from the proof of implementation
hierarchically structures the correctness proof, reducing the complexity of the verification
process [37].

In the first section, we discuss the correctness criterion and present an overview of
different steps in the correctness method. In the second section, we discuss the
implementation structure and the semantics of an implementation. In the third section, we
describe in detail the method for proving correctness of an implementation with respect to
a specification. In the fourth section, we discuss extensions to the proposed method for
proving correctness of recursive and mutually recursive implementations.




5.1 Correctness Criterion and Overview of Correctness Method 


As discussed in Chapter 3, a specification S in general specifies a set D(S) of
related data types, because the behavior of some of the operations is intentionally left
unspecified on certain inputs. In an implementation, the behavior of the procedures
implementing these operations must be defined on all inputs in their domains, because an
implementation in most programming languages realizes a single data type.¹ The designer
of an implementation must pick one data type from the set D(S) of data types.

If a specification specifies preconditions for the operations, the designer of an
implementation has the freedom to decide what the procedure implementing such an
operation should do on an input not satisfying its precondition. This is because in defining
the semantics of a specification, it is assumed to be the user's responsibility to ensure that
the input to the procedure satisfies the specified precondition. If a precondition is specified
for a constructor, the procedure implementing the constructor could either signal or return a
value of the defined type. However, the value returned must be constructible by a
procedure implementing a constructor using inputs satisfying its precondition (see the
discussion on p. 89 for the elaboration of this assumption). If a precondition is specified
for an observer, the procedure implementing the observer could return a value of its range
type, or signal. For example, the operations Pop and Replace of Stk-Int are specified to
have ‘~ (Empty(s) 2 0)’ as the precondition. An implementation of Stk-Int could have, for
example, the procedure implementing the constructor Pop either signal on an empty stack
or return an arbitrary stack.

For an operation specified to optionally signal exceptions, if the input to the
procedure implementing the operation satisfies the associated condition, the designer has a
choice between signalling the specified exception and returning a normal result that
satisfies the axioms. For example, if optional exceptions are used to specify the size
requirement on the values of a data type, as in the case of Stk-Int, an implementation must
decide the maximum size of the values. The procedure implementing the constructor Push
in an implementation of Stk-Int could either signal overflow or return a stack constructed
by pushing the integer argument on the stack argument.

1. We are not considering parameterized implementations.

If a specification specifies nondeterministic operations, the requirement that an
implementation of a nondeterministic operation must have the maximum amount of
nondeterminism specified by the specification is too strong. (In the case of the specification of
Set-Int given in Figure 3.1, such a requirement would mean that the procedure
implementing the operation Choose must be able to nondeterministically pick any element
of the set.) It is more appropriate to leave it to the designer of an implementation to decide
how much nondeterminism a procedure implementing a nondeterministic operation should
have. The procedure when viewed on ‘abstract’ values of the data type could be either
deterministic, returning a fixed result out of the many possible choices specified by the
specification for an input, or it could exhibit limited nondeterminism or the maximum amount
of nondeterminism specified by S, returning a subset of the set of possible results specified.
For example, a correct implementation with respect to the specification of Set-Int can have
the procedure implementing the operation Choose return the maximum integer in the set,
say, or it could have the procedure nondeterministically pick between the minimum and
maximum integers in the set, etc. As is discussed later, a deterministic procedure can also
simulate nondeterministic behavior on ‘abstract’ values by returning different values on
different values of the rep representing the same ‘abstract’ value of D. We call such a
procedure pseudo-nondeterministic.
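
The design choices just listed can be compared against the specification in a small model.
In the Python sketch below, which is an illustration rather than a correctness proof, each
candidate procedure for Choose is described by the set of results it may return on a
non-empty set; the names choose_max, choose_min_or_max, choose_any and acceptable
are assumptions of the sketch, and the behavior on the empty set is not modelled.

```python
from itertools import combinations


def spec_results(s):
    """Results the specification allows for Choose(s): any element of s."""
    return set(s)


def choose_max(s):
    """Deterministic choice: always the maximum integer in the set."""
    return {max(s)}


def choose_min_or_max(s):
    """Limited nondeterminism: either the minimum or the maximum."""
    return {min(s), max(s)}


def choose_any(s):
    """Maximum amount of nondeterminism specified by S."""
    return set(s)


def acceptable(procedure, universe=(1, 2, 3, 4)):
    """On every non-empty subset, some result is returned and all results are allowed."""
    for n in range(1, len(universe) + 1):
        for combo in combinations(universe, n):
            s = frozenset(combo)
            results = procedure(s)
            if not results or not results <= spec_results(s):
                return False
    return True
```

All three candidates pass the check, whereas a procedure that could return an integer
outside its argument would not.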


5.1.1 Semantics of an Implementation


By a procedure, henceforth, we mean a procedure in an implementation I of D
implementing an operation of D, unless stated otherwise; by a constructor procedure and
an observer procedure, we mean a procedure implementing a constructor and a procedure
implementing an observer, respectively. We use the name of an operation of D in S written
in capital letters, as the name of the procedure implementing the operation in I. Outside I,
we use an operation name instead of the name of the procedure implementing the
operation to signify that the data type is being used abstractly.

As data types are used abstractly in an implementation, the semantics of an
implementation I is a set of implementation algebras. These algebras can be constructed
hierarchically as in Chapter 2; we use in the construction, the implementation algebras
serving as the semantics of the implementations of the data types being used in I. Like a
type algebra, an implementation algebra has a domain corresponding to every defining
type D' ∈ Δ, which is defined by an implementation algebra of an implementation I_D' of D'.

The domain corresponding to D is in general a subset of a domain corresponding
to the rep defined by an implementation algebra of an implementation I_rep of the rep. It
consists of the values of the rep used to represent the values of D. The subset is
characterized by a formula Inv(r) with exactly one free variable r of the rep type. The
formula Inv(r) represents the strongest unary relation on the values of the rep preserved by
the constructor procedures in I. It captures the minimality property of the implementation,
namely that a value of the rep that represents a value of D can be constructed by finitely
many applications of the constructor procedures and that these values constitute the
smallest subset closed under the constructor procedures.
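
The minimality property can be illustrated by computing the closure under the constructor
procedures for a tiny rep. The Python sketch below makes several assumptions that are not
taken from the text: the rep is a tuple of integers, the constructor procedures are INSERT
(append if absent) and REMOVE, and the integers are restricted to a small universe so that
the closure is finite.

```python
def INSERT(rep, i):
    """Constructor procedure: append i unless it is already present."""
    return rep if i in rep else rep + (i,)


def REMOVE(rep, i):
    """Constructor procedure: drop every occurrence of i."""
    return tuple(x for x in rep if x != i)


def reachable(universe, max_steps):
    """Rep values constructible from the empty tuple in at most max_steps steps."""
    reps = {()}
    for _ in range(max_steps):
        new = set(reps)
        for r in reps:
            for i in universe:
                new.add(INSERT(r, i))
                new.add(REMOVE(r, i))
        reps = new
    return reps


def inv(rep):
    """A candidate Inv(r) for this rep: no integer occurs twice."""
    return len(set(rep)) == len(rep)
```

For the universe (1, 2) the closure is reached after two steps and contains five rep values;
a tuple such as (1, 1) is a value of the rep type that represents no value of the data type.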

Let F'(I) stand for the semantics of I. This set can be defined inductively. We
assume that a set of primitive data types supported by the host programming language are
implemented correctly with respect to their specifications by its compiler. The semantics
of the specifications of such primitive types serves as the basis step of the inductive
definition. If one wishes to prove the correctness of the implementation of a primitive
type, the primitive type of the language in which the compiler is coded would then serve as
the basis.

In the inductive step, an implementation algebra A in F'(I) has the following
structure:

A = [ { V_D } ∪ { V_D' | D' ∈ Δ }, EXV; { f_o | o ∈ Ω } ]

V_D = { v | v ∈ V_rep ∧ Inv(v) }, where V_rep is defined by an implementation algebra in
F'(I_rep) for an implementation I_rep of the rep. For each D' ∈ Δ, V_D' is defined by an algebra
in F'(I_D') for an implementation I_D' of D'. The specification of the procedure
implementing o is an abstract specification of f_o.

In the next section, we discuss how to construct F'(I) after the discussion about
the implementation structure and about Inv(r).




5.1.2 Correctness Method


If we consider specifications not specifying any nondeterministic operations, then 
the correctness criterion is simple: Fl) C. RS). So, -to. prove. the correctness of, an 
implementation I, we need to show that every implementation algebra in F'(I) is also in 
F(S), which can be done using the method discussed in Section 3.2 to show whether a type 
algebra is in FS). Two main steps of this method are: 

(i) Construct the observable equivalence relation on V i as discussed in Sections 2.2 and 
2.3, using the observable equivalence relation on V,,. corresponding to-each defining type 
D’ € A and the observable equivalence relation on Viep “and 

(ti) interpret the axioms and restrictions in the algebra, and show that they are satisfied. 


Since the set of observable equivalence relations is a congruence, the observable 
equivalence relations must be preserved by the procedures. The observatile equivalence 
relation is the largest such congruence on the algebra. | . 

The above discussion is the formal basis of the correctness method proposed by
Guttag et al. [29] and Kapur [40]. The observable equivalence relation on the domain
corresponding to D is Guttag et al.'s equality interpretation. The above method in fact
extends the methods in [29] and [40] because it can handle procedures signalling exceptions
as well as nondeterministic procedures implementing deterministic operations.

Note that if there exists a correct implementation I of S, then S is consistent,
because then F(S) is not empty. This is the basis of Guttag and Horning's statement [28]
that one way of showing consistency of S is to design a correct implementation I of S.


2. A nondeterministic procedure can implement a deterministic operation if all possible results of the 
procedure on every input are observably equivalent. 




5.1.2.1 Nondeterminism 


For a specification S specifying nondeterministic operations, the criterion that
F(I) ⊆ F(S) is too strong as it rules out implementations with pseudo-nondeterministic
procedures which ought to be correct. In such an implementation, a nondeterministic
operation is implemented either as a deterministic procedure or as a nondeterministic
procedure that does not preserve what should be the observable equivalence relation on the
values of the rep. It returns different values when applied on different rep values
representing the same 'abstract' value of D, but every value returned is a possible result
specified by S on the input; nondeterministic behavior of an operation is realized in this
way. If we take the largest equivalence relation on the rep values that is preserved by the
procedures as the interpretation of = in the implementation (which is so in case of
specifications not specifying nondeterministic operations), the axioms and restrictions in S
may not hold for such an implementation. However, if an equivalence relation preserved
only by the procedures implementing deterministic operations is taken as the observable
equivalence relation, then the axioms and restrictions in S hold.

Consider, for example, the implementation of Set-Int in a CLU-like language
given in Figure 5.1. The procedure CHOOSE is deterministic and returns the first element
of the sequence value used to represent the set argument. The largest equivalence relation
on the sequences preserved by all the procedures is the identity relation, and it can be
shown that the axioms of the specification of Set-Int do not hold if the identity relation is
taken as the observable equivalence relation. However, if we take the relation

    Eqv(s1, s2) = (SI$Size(s1) = SI$Size(s2)) ∧ (∀ i) [ IN(s1, i) = IN(s2, i) ], where
    IN(s, i) = (∃ j) [ 1 ≤ j ≤ SI$Size(s) ∧ SI$Fetch(s, j) = i ],

and SI stands for the data type Sequence of Integers, as the observable equivalence relation,
then the axioms hold. The procedure CHOOSE returns 1, for example, on the sequence
Addh(Addh(New, 1), 2) and 2 on Addh(Addh(New, 2), 1), so CHOOSE behaves differently
on members of the same equivalence class of sequences representing the same set value.
CHOOSE is an example of a pseudo-nondeterministic procedure.
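
The behavior of CHOOSE on the two sequences above can be checked on a small model. The
following Python sketch is only an illustration and is not part of the thesis: it assumes
that a value of Sequence-Int is modeled as a tuple whose first element is the bottom of
the sequence, and the names addh, eqv and choose are hypothetical stand-ins for Addh, Eqv
and CHOOSE.

```python
# Sketch only: tuples model Sequence-Int values; position 0 is the bottom.

def addh(s, i):
    """Add i at the high end of the sequence (stand-in for SI$Addh)."""
    return s + (i,)


def eqv(s1, s2):
    """Eqv from the text: equal sizes and the same members."""
    return len(s1) == len(s2) and set(s1) == set(s2)


def choose(s):
    """Stand-in for CHOOSE: return the bottom element, signal on empty."""
    if len(s) == 0:
        raise ValueError("no-element")
    return s[0]


s1 = addh(addh((), 1), 2)  # Addh(Addh(New, 1), 2)
s2 = addh(addh((), 2), 1)  # Addh(Addh(New, 2), 1)

# The two rep values are related by Eqv, yet CHOOSE tells them apart.
print(eqv(s1, s2), choose(s1), choose(s2))
```

Under these assumptions the two tuples are related by eqv but choose returns different
integers on them, which is the sense in which CHOOSE fails to preserve Eqv.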

To fully illustrate the correctness method, we discuss two variations of the
implementation in Figure 5.1 differing in the implementations of Choose. In the first,




Figure 5.1. An Implementation of Set-Int

SET-INT = cluster is NULL, INSERT, REMOVE, HAS, SIZE, CHOOSE
    rep = SEQUENCE-INT

    NULL = proc() returns (cvt)
        return (rep$New())
    end NULL

    INSERT = proc(s: cvt, i: Int) returns (cvt)
        if INDEX(s, i) <= rep$Size(s) then return (s) end
        return (rep$Addh(s, i))
    end INSERT

    REMOVE = proc(s: cvt, i: Int) returns (cvt)
        j: Int := INDEX(s, i)
        if j <= rep$Size(s) then return (rep$Remh(rep$Replace(s, j, rep$Top(s)))) end
        return (s)
    end REMOVE

    HAS = proc(s: cvt, i: Int) returns (Bool)
        return (INDEX(s, i) <= rep$Size(s))
    end HAS

    SIZE = proc(s: cvt) returns (Int)
        return (rep$Size(s))
    end SIZE

    CHOOSE = proc(s: cvt) returns (Int) signals (no-element)
        if rep$Size(s) = 0 then signal no-element end
        return (rep$Bottom(s))
    end CHOOSE

    INDEX = proc(s: cvt, i: Int) returns (Int)
        c: Int := 1
        while c <= rep$Size(s) do
            if rep$Fetch(s, c) = i then return (c) end
            c := c + 1
        end
        return (c)
    end INDEX


Choose is implemented as a deterministic procedure CHOOSE' which returns the
maximum integer in the nonempty sequence; the procedure CHOOSE' is given in
Figure 5.2. In the second, Choose is implemented as a nondeterministic procedure
CHOOSE'' which returns the maximum or minimum integer in the nonempty sequence.
CHOOSE'' is given in Figure 5.3. The construct Select in the code of CHOOSE'' behaves
nondeterministically: Select(S1, S2, ..., Sn), where each Si is a statement, arbitrarily picks
one of the statements given as its arguments for execution. Note that neither CHOOSE' nor
CHOOSE'' is pseudo-nondeterministic.
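
The difference between these variants and CHOOSE can be seen by comparing the possible
results on all representations of one set. The Python sketch below is a hedged
illustration, not the thesis's method: a nondeterministic procedure is modeled as a
function returning the set of its possible results, sequences are modeled as tuples, and
all function names are hypothetical.

```python
from itertools import permutations


def choose_bottom(s):
    """Stand-in for CHOOSE of Figure 5.1: the bottom element."""
    if not s:
        raise ValueError("no-element")
    return {s[0]}


def choose_max(s):
    """Stand-in for CHOOSE': the only possible result is the maximum."""
    if not s:
        raise ValueError("no-element")
    return {max(s)}


def choose_max_or_min(s):
    """Stand-in for CHOOSE'': Select may pick MAX(s) or MIN(s)."""
    if not s:
        raise ValueError("no-element")
    return {max(s), min(s)}


def same_results_on_all_reps(proc, members):
    """True if proc has the same possible results on every ordering."""
    results = [proc(p) for p in permutations(members)]
    return all(r == results[0] for r in results)


print(same_results_on_all_reps(choose_max, (1, 2, 3)))
print(same_results_on_all_reps(choose_max_or_min, (1, 2, 3)))
print(same_results_on_all_reps(choose_bottom, (1, 2, 3)))
```

In this model the first two procedures yield the same result set on every ordering of the
members, while the bottom-element procedure does not.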


Figure 5.2. CHOOSE'


    CHOOSE' = proc(s: cvt) returns (Int) signals (no-element)
        if rep$Size(s) = 0 then signal no-element end
        return (MAX(s))
    end CHOOSE'


    MAX = proc(s: rep) returns (Int)
        m := rep$Bottom(s)
        for i := 2 to rep$Size(s) do
            if m < rep$Fetch(s, i) then m := rep$Fetch(s, i) end
        end
        return (m)
    end MAX


Figure 5.3. CHOOSE''


    CHOOSE'' = proc(s: cvt) returns (Int) signals (no-element)
        if rep$Size(s) = 0 then signal no-element end
        Select(return (MAX(s)), return (MIN(s)))
    end CHOOSE''


    MIN = proc(s: rep) returns (Int)
        m := rep$Bottom(s)
        for i := 2 to rep$Size(s) do
            if m > rep$Fetch(s, i) then m := rep$Fetch(s, i) end
        end
        return (m)
    end MIN




5.1.2.2 Definition of Correctness 


We can now state the correctness criterion. It has two parts. The first part deals
with implementations not having pseudo-nondeterministic procedures, and the second part
takes care of pseudo-nondeterministic procedures. In the second part, the equivalence
relation used on the rep is not required to be preserved by the procedures implementing
nondeterministic operations, thus allowing them to be pseudo-nondeterministic; the
equivalence relation is only required to be preserved by the procedures implementing
deterministic operations.


Def. 5.1. An implementation I is correct with respect to a specification S if and only if,
assuming that every data type D' used in I has a correct implementation I' with respect to its
specification S',
(i) F(I) ⊆ F(S), or
(ii) for every algebra A ∈ F(I), there is a set of equivalence relations,
E = { E_D' | D' ∈ Δ ∪ {D} } ∪ { E_EXV }, such that
(a) for every defining type D' ∈ Δ, E_D' is the equivalence relation on V_D' used to prove
correctness of the implementation I_D' of D', and similarly, E_rep is the equivalence relation
on V_rep used to prove correctness of an implementation I_rep of the rep,
(b) E_EXV is the equivalence relation defined as follows: For an exception name ex of
arity D_1 × ... × D_n, if <v_1, v_1'> ∈ E_D_1, ..., <v_n, v_n'> ∈ E_D_n, then
<ex(v_1, ..., v_n), ex(v_1', ..., v_n')> ∈ E_EXV,
(c) E_rep ⊆ E_D,
(d) E is preserved by the functions corresponding to deterministic operations in A, and
(e) A/E ∈ F(S).


A/E is the quotient algebra of A induced by E except that E need not be a congruence; the
function f' in A/E corresponding to f in A that does not preserve E behaves
nondeterministically. The formal characterization above is complex because an
implementation of a defining type or the rep could also have pseudo-nondeterministic
procedures.
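
The way a function that does not preserve E becomes nondeterministic in A/E can be
pictured on a small model. The Python sketch below is an assumption-laden illustration
rather than the formal construction: E is taken to be the relation that identifies all
duplicate-free orderings of the same integers, an equivalence class is enumerated
explicitly, and the helper names are hypothetical.

```python
from itertools import permutations


def rep_class(members):
    """All duplicate-free sequences (tuples) representing the set `members`."""
    return set(permutations(sorted(members)))


def quotient_results(proc, members):
    """Possible results of proc's counterpart in A/E on one equivalence class."""
    return {proc(s) for s in rep_class(members)}


def choose(s):
    """Stand-in for CHOOSE: the bottom element of a nonempty sequence."""
    return s[0]


def size(s):
    """Stand-in for SIZE."""
    return len(s)


# choose does not preserve E, so its counterpart has several possible results;
# size preserves E, so its counterpart stays single-valued.
print(quotient_results(choose, {1, 2}), quotient_results(size, {1, 2}))
```

Here the counterpart of choose on the class of {1, 2} may return either member, whereas
the counterpart of size has exactly one result.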

In the correctness method, we do not explicitly construct the set F(I) of
implementation algebras defined by I. We reason about the set as a whole by not using any
property specific to any particular implementation of D' ∈ Δ or of the rep, and by instead
using the procedure specifications and the theories of the defining types and the rep. We
show that the axioms and restrictions of S hold when interpreted in I by deriving them
from the procedure specifications.

Roughly speaking, the following steps need to be carried out to show correctness
of an implementation:

(i) Derive the specification of every procedure in the implementation as a function on
rep values from its code.

(ii) Design a formula Inv(r) characterizing the subset of the rep values needed to
represent the values of D. It must express the strongest unary relation preserved by the
constructor procedures.

(iii) Design the equivalence relation on the values of the rep satisfying Inv. The
equivalence relation must be preserved by the procedures implementing the deterministic
operations.

(iv) Interpret the restrictions and axioms using the procedures in place of the operations.
Replace a variable of type D by a variable of the rep type satisfying Inv. Interpret =
corresponding to D as the equivalence relation of step (iii).


We discuss each of these steps in detail in the next two sections. The second section
discusses the first two steps; the remaining steps and the correctness method are illustrated
in the third section. We argue that a formula weaker than Inv often suffices; furthermore,
the equivalence relation needed in step (iv) is also often weaker than the strongest
equivalence relation preserved by the procedures implementing the deterministic
operations. We also discuss what extra steps need to be performed if auxiliary functions
are used in a specification.

For recursive and mutually recursive implementations, there is an additional step
in the correctness proof. We need to show that the rep (reps in case of mutually recursive
implementations) defined by a recursive domain equation(s) is nonempty. The rest of the
proof is the same as in the case of nonrecursive implementations.




5.2 Implementation Structure and Semantics 


Besides the procedures implementing the operations of D, an implementation I of
D may include helping procedures needed in writing the procedures implementing the
operations. For example, INDEX is a helping procedure in the implementation of Set-Int
given in Figure 5.1. A helping procedure is not available outside the implementation, so
we call it an internal procedure of I. Let I_P stand for the set of all internal procedures used
in I. The procedures in I may also use types other than the rep and the defining types of D,
if need be; we call such types internal types of I and denote the set of internal types in I as
I_T. Note that the internal procedures and internal types of an implementation I are
different from the auxiliary functions and auxiliary types used in its specification S.

In this thesis, we do not wish to be concerned about the semantics of the control
structures used in coding the procedures. There are at least two approaches to avoid
considering the control structures, which are discussed below. However, we illustrate the
correctness method using only the translational approach. We have worked the correctness
proofs using the other approach; the proofs in that case are similar in flavor to the proofs
using the translational approach. These proofs are not presented in the thesis. We believe
that the correctness method would work using any approach for specifying the procedures.

Most programming languages supporting user defined data types provide a
mechanism that encapsulates a collection of procedures implementing the operations of a
data type and provides an abstract view of data outside the mechanism, for example,
cluster in CLU, form in ALPHARD, etc. The encapsulation mechanism constrains the use
of the procedures. We discuss below the properties desired of an encapsulation mechanism
that facilitate the correctness proof of an implementation. Finally, we discuss how we get
the semantics of an implementation I as a set F(I) of implementation algebras to complete
the formal aspects of the correctness method.




5.2.1 Procedures - Approach I


In Chapter 4, we discussed a method based on the Floyd-Hoare approach for
specifying a procedure. In this method, a procedure is specified as a set of formulas
relating its input to the result(s) returned by it. The procedures implementing the
operations in an implementation I can be specified in this way; the specifications of
internal procedures are not included if they are not referred to in the specifications of the
procedures implementing the operations. A procedure is specified as a transformation on
the values of the rep. To verify the correctness of a procedure with respect to its
specification, the theories of the defining types, the rep, and the internal types are used.

Figure 5.4 is the specification of the procedures in the implementation of Set-Int
given in Figure 5.1 using this method. It also has specifications of CHOOSE' and
CHOOSE''. Instead of using the procedure invocation itself to stand for the result (or a
possible result in case of a nondeterministic procedure), we have introduced, for
convenience, a name for the result. For example, the specification of the procedure
REMOVE uses r to stand for the result of REMOVE on inputs s and i. The specification
captures that

(i) if the integer argument i is in the sequence argument s, then r is the sequence obtained
by first replacing the first occurrence of i in s by the topmost element in the sequence and
then getting rid of the topmost element; otherwise,

(ii) r is s itself.

In deriving these specifications, we have used the specification of the data
type Sequence-Int given in Appendix IV.


5.2.2 Procedures - Approach II


We translate a procedure implemented in a rich imperative programming
language to a simple applicative language similar to the specification language proposed in
Chapter 3 using the method suggested by McCarthy [56] (see [54] where the method is well
explained). We use the translated procedures to prove the correctness of the implementation I.
Guttag et al. [29] and Kapur [40] take this approach; they use a language supporting
conditional expressions, composition, recursion, and the use of auxiliary functions.




Figure 5.4. Specification of the Procedures in the Implementation of Set-Int Using Approach I

NULL() : (= r)
    r = rep$New()


INSERT(s, i) : (= r)
    (In(s, i) ⇒ r = s) ∧ (~ In(s, i) ⇒ r = rep$Addh(s, i))


REMOVE(s, i) : (= r)
    ((∃ j) [ i = s[j] ∧ (∀ j') [ j' < j ⇒ ~ i = s[j'] ] ∧
             r = rep$Remh(rep$Replace(s, j, rep$Top(s))) ]) ∨ (~ In(s, i) ∧ r = s)


HAS(s, i) : (= b)
    (b = T) ⇔ In(s, i)


SIZE(s) : (= i)
    i = rep$Size(s)


CHOOSE(s) : (= i)
    rep$Size(s) = 0 ⇒ CHOOSE(s) signals no-element()
    rep$Size(s) > 0 ⇒ i = s[1]


CHOOSE'(s) : (= i)
    rep$Size(s) = 0 ⇒ CHOOSE'(s) signals no-element()
    rep$Size(s) > 0 ⇒ (In(s, i) ∧ (∀ j) [ 1 ≤ j ≤ rep$Size(s) ⇒ s[j] ≤ i ])


CHOOSE''(s) : (= i)
    rep$Size(s) = 0 ⇒ CHOOSE''(s) signals no-element()
    rep$Size(s) > 0 ⇒ (In(s, i) ∧ ((∀ j) [ 1 ≤ j ≤ rep$Size(s) ⇒ s[j] ≤ i ]
                                    ∨ (∀ j) [ 1 ≤ j ≤ rep$Size(s) ⇒ i ≤ s[j] ])),


where In(s, i)  = (∃ j) [ 1 ≤ j ≤ rep$Size(s) ∧ s[j] = i ]
      In'(s, i) = (∃ j) [ 1 ≤ j ≤ rep$Size(s) ∧ s[j] = i ∧ (∀ j') [ j' < j ⇒ ~ i = s[j'] ] ]


We use an extended applicative language that has a signal primitive and guarded
expressions in addition to composition and recursion mechanisms, and the use of auxiliary
functions, so that the procedures signalling exceptions and exhibiting nondeterministic
behavior can be specified. Conditional expressions can be simulated using guarded
expressions. The translation method proposed by McCarthy can be extended to deal with
the exception handling mechanism and the nondeterministic construct in a programming
language.

An expression is similar to a term; it uses procedure names implementing the
operations, internal procedure names, the auxiliary procedure names introduced during the
translation, and terms.

The signal primitive takes arbitrarily many (nonzero) arguments; its first
argument is an exception name, and other arguments are expressions of various types. Its
syntax is signal(ex, e_1, ..., e_n), where ex is an exception name with arity D_1 × ... × D_n
and each e_i is an expression of type D_i.

A guarded expression is similar to Dijkstra's guarded commands; its syntax is

    <guarded expression> ::= <expression> | <alternative> [ □ <alternative> ]*
    <alternative> ::= <condition> ⇒ <guarded expression>
    <condition> ::= <boolean expression>,

where [ X ]* stands for zero or finitely many repetitions, and the symbol □ stands for
nondeterministic choice among various alternatives. If a guarded expression is simply an
expression, then its semantics is that of an expression. Otherwise, if a guarded expression is
a collection of alternatives, then for an instance of its variables, its semantics is the
semantics of the guarded expression of an arbitrarily chosen alternative whose boolean
condition is T. If every alternative has its condition as F, then the semantics of the guarded
expression is undefined. A guarded expression exhibits nondeterministic behavior because
for an instance of the variables, there are in general many alternatives whose condition is T,
and one such alternative is arbitrarily chosen.
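
The semantics just described can be sketched executably. The Python fragment below is
a simplified illustration under stated assumptions, not the thesis's formal semantics: a
guarded expression is modeled as a list of (condition, body) pairs of functions over an
environment, nested guarded expressions are not modeled, a signal is represented by a
tagged tuple, and every name in the fragment is hypothetical.

```python
def possible_results(alternatives, env):
    """Return the set of results a guarded expression may yield in env.

    An empty set models the undefined case: no condition evaluated to T.
    """
    return {body(env) for cond, body in alternatives if cond(env)}


# CHOOSE'' as a guarded expression: on a nonempty sequence two guards are T,
# so either body may be chosen.
CHOOSE2 = [
    (lambda e: len(e["s"]) == 0, lambda e: ("signal", "no-element")),
    (lambda e: len(e["s"]) != 0, lambda e: max(e["s"])),
    (lambda e: len(e["s"]) != 0, lambda e: min(e["s"])),
]

print(possible_results(CHOOSE2, {"s": (3, 1, 2)}))
print(possible_results(CHOOSE2, {"s": ()}))
```

Collecting all enabled alternatives, instead of picking one at random, makes the arbitrary
choice visible as a set and keeps the sketch repeatable.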

We translate the procedures in the implementation of Set-Int in Figure 5.1 to the
above applicative language. Figure 5.5 is their translation; we have also included the
translation of the procedures CHOOSE' and CHOOSE'' as well as of the internal
procedures MAX and MIN. In translating the internal procedure INDEX, the auxiliary
function f is introduced to simulate the effect of the while loop used in INDEX. Similarly,


3. An alternate approach to introducing guarded expressions for specifying the nondeterministic behavior of
a procedure OP is to specify its non-exceptional behavior using a deterministic boolean auxiliary function
OP_P, similar to the function σ_p corresponding to a nondeterministic operation σ as discussed in the
previous chapter. For an input on which the nondeterministic procedure returns a normal value, the
corresponding auxiliary function holds for all possible values returned by the procedure on that input and
does not hold for other values. Then the procedures can be specified using conditional expressions and
recursion. We have adopted the above approach for specifying the procedures, because it is direct and
simple.




the auxiliary procedures f' and f'' are introduced to simulate the for loop in MAX and MIN
respectively.

Cartwright and McCarthy's first order semantics of recursive programs [8] can be
used to prove properties about the procedures written in the above applicative language.
The recursive definition of a procedure is considered as an axiom defining the function
computed by the procedure. Because of the nondeterministic behavior of a guarded
expression, we have to be careful in using such an axiom, or we will run into
inconsistencies. For a particular instantiation of variables in the axiom, we use every
possible alternative whose condition is T, and we do not relate any two alternatives whose
conditions are T. For example, for CHOOSE'', there are two alternatives, MAX(s) and
MIN(s), for the case (~ rep$Size(s) = 0). We do not relate MAX(s) to MIN(s), as relating
them can cause inconsistency. The termination of a procedure is proved separately either
using the method suggested by Cartwright and McCarthy, or the method based on well
founded ordering [14].

The translational approach is purely based on the semantics of the control
structures of the host programming language in terms of the primitives of the applicative
language incorporated into the translation method. The properties of the types involved in
the implementation can be used in simplifying the resulting translations.
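
The recursive auxiliary functions of Figure 5.5 that stand in for the loops of INDEX, MAX
and MIN can be written out directly. The following Python sketch is a hedged
transliteration for illustration only: sequences are modeled as tuples with 1-based
fetch, and the Python names (f_max for f', f_min for f'', and so on) are hypothetical.

```python
def size(s):
    return len(s)


def fetch(s, c):
    """1-based Fetch on a tuple model of Sequence-Int."""
    return s[c - 1]


def f(s, i, c):
    """Simulates the while loop of INDEX, starting the scan at position c."""
    if not c <= size(s):
        return c
    if fetch(s, c) == i:
        return c
    return f(s, i, c + 1)


def f_max(s, m, c):
    """Simulates the for loop of MAX; m is the maximum seen so far."""
    if not c <= size(s):
        return m
    if m < fetch(s, c):
        return f_max(s, fetch(s, c), c + 1)
    return f_max(s, m, c + 1)


def f_min(s, m, c):
    """Simulates the for loop of MIN; m is the minimum seen so far."""
    if not c <= size(s):
        return m
    if m > fetch(s, c):
        return f_min(s, fetch(s, c), c + 1)
    return f_min(s, m, c + 1)


def index(s, i):
    return f(s, i, 1)


def max_of(s):
    return f_max(s, fetch(s, 1), 2)


def min_of(s):
    return f_min(s, fetch(s, 1), 2)


print(index((5, 7, 9), 7), index((5, 7, 9), 4), max_of((5, 9, 7)), min_of((5, 9, 7)))
```

In this model index returns a position larger than the size when the integer is absent,
which is the property the procedures INSERT, REMOVE and HAS test for.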


5.2.3 Properties of the Encapsulation Mechanism 


As was stated earlier, in most of the programming languages supporting user
defined data types, an implementation of a data type is an encapsulation of the procedures
implementing the operations that disciplines their use. Such an implementation is
protected: A procedure implementing an operation of D cannot be passed any arbitrary
value of the rep as a representation of a value of D; rather, only a value of the rep
constructed earlier as a representation for a value of D by the constructor procedures of D
can be passed. Every value of the rep need not in general be used to represent a value of D.
The procedures are invoked only on those values of the rep which can be constructed by
finitely many applications of the constructor procedures of D. (For example, the procedure
REMOVE in the implementation of Set-Int in Figure 5.1 is never passed a sequence having




Figure 5.5. Translation of the Procedures in the Implementation of Set-Int


NULL()        ≜ rep$New()

INSERT(s, i)  ≜ INDEX(s, i) ≤ rep$Size(s) ⇒ s □
                (~ INDEX(s, i) ≤ rep$Size(s)) ⇒ rep$Addh(s, i)

REMOVE(s, i)  ≜ INDEX(s, i) ≤ rep$Size(s) ⇒
                    rep$Remh(rep$Replace(s, INDEX(s, i), rep$Top(s))) □
                (~ INDEX(s, i) ≤ rep$Size(s)) ⇒ s

HAS(s, i)     ≜ INDEX(s, i) ≤ rep$Size(s)

SIZE(s)       ≜ rep$Size(s)

CHOOSE(s)     ≜ rep$Size(s) = 0 ⇒ signal(no-element) □
                (~ rep$Size(s) = 0) ⇒ rep$Bottom(s)

INDEX(s, i)   ≜ f(s, i, 1)

CHOOSE'(s)    ≜ rep$Size(s) = 0 ⇒ signal(no-element) □
                (~ rep$Size(s) = 0) ⇒ MAX(s)

MAX(s)        ≜ f'(s, rep$Bottom(s), 2)

CHOOSE''(s)   ≜ rep$Size(s) = 0 ⇒ signal(no-element) □
                (~ rep$Size(s) = 0) ⇒ MAX(s) □
                (~ rep$Size(s) = 0) ⇒ MIN(s)

MIN(s)        ≜ f''(s, rep$Bottom(s), 2)


Auxiliary Functions

f   : rep × Int × Int → Int
f'  : rep × Int × Int → Int
f'' : rep × Int × Int → Int

f(s, i, c)    ≜ (~ c ≤ rep$Size(s)) ⇒ c □
                ((c ≤ rep$Size(s)) ∧ (rep$Fetch(s, c) = i)) ⇒ c □
                ((c ≤ rep$Size(s)) ∧ ~ (rep$Fetch(s, c) = i)) ⇒ f(s, i, c + 1)

f'(s, m, c)   ≜ (~ c ≤ rep$Size(s)) ⇒ m □
                ((c ≤ rep$Size(s)) ∧ (m < rep$Fetch(s, c))) ⇒ f'(s, rep$Fetch(s, c), c + 1) □
                ((c ≤ rep$Size(s)) ∧ (~ m < rep$Fetch(s, c))) ⇒ f'(s, m, c + 1)

f''(s, m, c)  ≜ (~ c ≤ rep$Size(s)) ⇒ m □
                ((c ≤ rep$Size(s)) ∧ (m > rep$Fetch(s, c))) ⇒ f''(s, rep$Fetch(s, c), c + 1) □
                ((c ≤ rep$Size(s)) ∧ (~ m > rep$Fetch(s, c))) ⇒ f''(s, m, c + 1)




multiple occurrences of an integer, as such a sequence cannot be constructed using NULL,
INSERT and REMOVE.) We are interested in the behavior of the procedures only on this
subset of the values of the rep. The subset is characterized by the formula Inv(r) discussed
in the previous section, which expresses the strongest unary relation on the values of the
rep preserved by the constructor procedures of D. Inv(r) is expressed without alluding to
any particular implementation of the rep type.


Def. 5.2 A procedure OP implementing a constructor σ : D_1 × ... × D_n → D preserves Inv
if and only if
whenever ((∀ 1 ≤ i ≤ n) [ D_i = D ⇒ Inv(x_i) ]), then
(i) if OP(x_1, ..., x_n) returns a normal value, Inv(OP(x_1, ..., x_n)); otherwise,
(ii) if OP(x_1, ..., x_n) signals ex(e_1, ..., e_m), then for each e_j of type D, Inv(e_j).
If OP is nondeterministic, all possible results returned by OP must satisfy Inv.


For the implementation of Set-Int given in Figure 5.1, Inv(s) is

    (∀ i, j) [ (1 ≤ i, j ≤ rep$Size(s) ∧ i ≠ j) ⇒ s[i] ≠ s[j] ],

where s[i] is an abbreviation for rep$Fetch(s, i). It can be verified that Inv(s) is preserved by
the constructor procedures of Set-Int. Figure 5.6 is a proof that REMOVE preserves Inv(s),
the most difficult among the three cases. Any formula stronger than the one above is not
preserved by the constructor procedures.
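
Both claims about Inv(s) can be explored by brute force on a small universe. The Python
sketch below is an illustration under explicit assumptions and is no substitute for the
proof: sequences are modeled as tuples, the constructors are hypothetical
transliterations of NULL, INSERT and REMOVE, and the search is bounded to the integers
1, 2 and 3.

```python
from itertools import permutations


def inv(s):
    """Inv(s): no integer occurs twice in the sequence."""
    return len(set(s)) == len(s)


def insert(s, i):
    """Stand-in for INSERT: add i at the high end unless already present."""
    return s if i in s else s + (i,)


def remove(s, i):
    """Stand-in for REMOVE: overwrite the first i with the top, drop the top."""
    if i not in s:
        return s
    j = s.index(i)
    return (s[:j] + (s[-1],) + s[j + 1:])[:-1]


def reachable(ints, rounds):
    """Rep values built from NULL by at most `rounds` constructor steps."""
    seen = {()}
    frontier = {()}
    for _ in range(rounds):
        frontier = {op(s, i) for s in frontier for i in ints
                    for op in (insert, remove)} - seen
        seen |= frontier
    return seen


INTS = (1, 2, 3)
REACHED = reachable(INTS, 4)
# Every duplicate-free sequence over INTS, i.e. every value satisfying inv.
ALL_INV = {p for n in range(len(INTS) + 1) for p in permutations(INTS, n)}

print(all(inv(s) for s in REACHED), REACHED == ALL_INV)
```

On this bounded universe every reachable value satisfies inv, and every value satisfying
inv is reachable, which is what one would expect if Inv is the strongest preserved relation.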

Inv may be difficult to deduce from a complex implementation, but the designer
of an implementation usually has a good idea about what Inv is. In the correctness proof,
Inv is usually not necessary; a weaker property may suffice. In case Inv is available, a
property of the representing values needed in the correctness proof can be deduced directly
from Inv. Otherwise, if Inv is not available, then the property can be deduced by checking
whether the property is preserved by the constructor procedures: since Inv is the strongest
unary relation preserved by the constructor procedures, any unary relation preserved by
the constructor procedures is implied by Inv.

If a module implementing an abstract data type in a programming language is not
protected, as would be the case if abstract data types are simulated in PASCAL or PL/I,
say, then




Figure 5.6. Proof of REMOVE Preserving Inv


Assume Inv(s) holds. To show that Inv(REMOVE(s, i)) holds.
If the type name is not included in the operation names below, we assume that the operations are of type rep.
There are two cases.


Case 1: INDEX(s, i) ≤ Size(s)
    Size(s) > 0 ⇔ T, from the specification of INDEX
    Inv(REMOVE(s, i)) ⇔ Inv(Remh(Replace(s, INDEX(s, i), Top(s)))), from the specification of REMOVE
    It can be shown using the specification of INDEX and the theory of Sequence-Int that
    (i) (Inv(s) ∧ 0 < k ≤ Size(s) ∧ s' = Replace(s, k, Top(s))) ⇒
            (((∀ k1) [ (1 ≤ k1 < Size(s) ∧ ~ k = k1) ⇒ s'[k1] = s[k1] ]) ∧ s'[k] = Top(s))
    (ii) (Size(s) > 0 ∧ s' = Remh(s)) ⇒
            ((∀ k) [ 1 ≤ k < Size(s) ⇒ s'[k] = s[k] ] ∧ Size(s') = Size(s) - 1)
    Using (i) and (ii), we have Inv(REMOVE(s, i)) ⇔ T

Case 2: ~ INDEX(s, i) ≤ Size(s)
    Inv(REMOVE(s, i)) ⇔ Inv(s), from the specification of REMOVE
                      ⇔ T


(i) restrictions must be imposed on the global variables, if any, as well as on the use of the
procedures implementing the operations to ensure the minimality property of the
implementation, and

(ii) Inv must be preserved wherever a procedure implementing an operation is invoked.
Such a proof is likely to be global and complex. (Guttag [31] discusses restrictions on the
Euclid implementation module to ensure that the module satisfies the minimality property.)
In the following discussion, we assume that the semantics of a mechanism encapsulating
the procedures implementing the operations of a data type ensures the minimality
property.

It is not necessary for the procedures to terminate over their entire input domain
if Inv(r) is other than T. To prove total correctness of an implementation, it is sufficient
that a procedure implementing an operation σ that has its i-th argument x_i of type D
terminates whenever Inv(x_i) holds.




5.2.4 Semantics of an Implementation 


Now that we have the procedure specifications, we can construct the
implementation algebras of I using them. Since procedure specifications may use internal
types and internal and auxiliary procedures, we first construct the extended
implementation algebras and then derive the implementation algebras from them. For
every possible implementation I' of a type D' used in the implementation I, we have the set
of its implementation algebras. In an implementation algebra of I, the domain
corresponding to D' is the domain defined by an implementation algebra of I'. An
extended implementation algebra A' of I has the following structure:

    A' = [ {V_D} ∪ {V_D' | D' ∈ Δ ∪ I_T}, EXV; {i_σ | σ ∈ Ω ∪ I_P} ],⁴
    V_D = { v | v ∈ V_rep ∧ Inv(v) }.

The function i_σ is the interpretation of the specification of
the procedure corresponding to σ in A'. From A', we get an implementation algebra A,

    A = [ {V_D} ∪ {V_D' | D' ∈ Δ}, EXV; {i_σ | σ ∈ Ω} ],


4. In addition to the internal procedures, I_P is assumed to include the auxiliary procedures needed in the
translation of the procedures into the applicative language discussed above.




5.3 Correctness Method 


We describe the remaining steps of the correctness method outlined in
Subsection 5.1.2. For completeness, we repeat the steps discussed in the previous section
about the termination of the procedures and the preservation of the formula Inv. For a
specification specifying nondeterministic operations, we discuss the method for three cases:
An implementation of a nondeterministic operation is (i) a deterministic procedure, (ii) a
nondeterministic procedure, and (iii) a pseudo-nondeterministic procedure. We first use
the implementation of Set-Int given in Figure 5.1 with CHOOSE replaced by CHOOSE'
for illustrating the method for the deterministic case. Later, we use CHOOSE'' as the
implementation of Choose to illustrate the method for the nondeterministic case, and
finally, we use CHOOSE to illustrate the method for the pseudo-nondeterministic case.


5.3.1 Auxiliary Functions in a Specification


If a specification S uses auxiliary functions and auxiliary types, we extend an
implementation I to include the implementations of the auxiliary functions in the
correctness proof. We include in the specifications of the procedures of I the specifications
of the implementations of the auxiliary functions. For showing the correctness of I, we use
the extended implementation instead of I in the following steps; an auxiliary function is
treated like an operation. In the following discussion, whenever we say I, we mean the
extended implementation if S uses auxiliary functions.


5.3.2 Preservation of Inv


If the formula Inv(r), which characterizes the subset of values of the rep used to
represent the values of D, is available, verify that Inv(r) is preserved by every constructor
procedure. We showed in the previous section that for the implementation of Set-Int in
Figure 5.1, its Inv is preserved by every constructor procedure.

If Inv(r) is not available and cannot be guessed easily, we temporarily assume that
every value of the rep is being used to represent the values of D. In the derivation of the
axioms and restrictions of S from the procedure specifications, in case we need any property
P(r) of the rep values, we deduce P(r) by showing that P(r) is preserved by the constructor
procedures of D, as in that case Inv(r) would imply P(r).

In the derivation of an axiom or a restriction in S from the procedure
specifications, a variable of type D is instantiated to a value of the rep satisfying Inv(r) (or
P(r) if Inv(r) is not available).


5.3.3 Termination of Procedures 


Prove that every procedure in I is total on the arguments it can expect, i.e., if an
argument to a procedure is of type D, prove that the procedure terminates if these
arguments are values of the rep satisfying Inv(r).


5.3.4 Proving Restrictions and Axioms


Show that every restriction in S specifying the exceptional behavior and every
axiom in S specifying the normal behavior can be derived from the specifications of
procedures in I. The operation symbols and the auxiliary function symbols in the axioms
and restrictions are replaced by the names of procedures implementing them. The theories
derived from the specifications of the defining types, the rep, and internal types can be
used in the derivations.

The symbol = in S is interpreted as the observable equivalence relation. =_D is
usually interpreted as the largest equivalence relation on the values of the rep satisfying Inv,
preserved by the procedures. The exception is the case when a nondeterministic operation
is implemented as a pseudo-nondeterministic procedure. Then, the observable equivalence
relation serving as the interpretation of =_D is required to be preserved only by the
procedures implementing deterministic operations, and it need not be the largest such
equivalence relation.




5.3.4.1 Preservation of Equivalence Relation


A deterministic procedure OP implementing an operation σ : D_1 × ... × D_n → D'
preserves an equivalence relation on the rep values, expressed as a first order formula
Eqv(s_1, s_2), where s_1 and s_2 are of rep type and are the only free variables in the formula, if
and only if, whenever for each 1 ≤ i ≤ n, ([ D_i = D ⇒ Eqv(x_i, y_i) ] ∧ [ D_i ≠ D ⇒ x_i = y_i ])
holds, either

(i) 'OP(x_1, ..., x_n) signals ext_1' holds and 'OP(y_1, ..., y_n) signals ext_2' holds such that
'ext_1 = ext_2' is provable. In addition to the rules discussed in the previous chapter, we
have: For an exception name ex of arity D_1 × ... × D_m, if for every D_j = D, Eqv(x_j, y_j) and
for every D_j ≠ D, x_j = y_j, then 'ex(x_1, ..., x_m) = ex(y_1, ..., y_m)' is provable. Or,

(ii) if D' = D, then 'Eqv(OP(x_1, ..., x_n), OP(y_1, ..., y_n))' is provable, and if D' ≠ D then
'OP(x_1, ..., x_n) =_D' OP(y_1, ..., y_n)' is provable.


If OP is nondeterministic then (ii) above is modified to be: If D' = D, then for every
possible result r_1 returned by OP(x_1, ..., x_n), OP(y_1, ..., y_n) can return r_2 such that
Eqv(r_1, r_2) is provable, and vice versa, and if D' ≠ D, for every r_1 returned by
OP(x_1, ..., x_n), OP(y_1, ..., y_n) can return r_2 such that 'r_1 =_D' r_2' is provable,
and vice versa.

For example, Eqv(s1, s2) for the implementation of Set-Int in Figure 5.1 with
CHOOSE' replacing CHOOSE is

(SI$Size(s1) = SI$Size(s2)) ∧ (∀ i) [ IN(s1, i) = IN(s2, i) ], where

IN(s, i) = (∃ j) [ 1 ≤ j ≤ SI$Size(s) ∧ s[j] = i ]

It relates sequences that are permutations of each other. Eqv is preserved by every
procedure implementing an operation of Set-Int. Figure 5.7 has the proofs for the
procedures INSERT and HAS. Eqv(s1, s2) is the largest equivalence relation preserved by
the procedures. Any equivalence relation stronger than Eqv would have to relate sequences
that are not permutations, and is thus not preserved by HAS.
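The preservation condition above can be checked by brute force on a small finite model. The following Python sketch is only an illustration under stated assumptions: tuples without duplicates stand in for rep values satisfying Inv, `insert` and `has` are simplified stand-ins for INSERT and HAS (not the CLU code of Figure 5.1), and the helper names `reps`, `preserves`, and `same_size` are hypothetical.

```python
from itertools import permutations

def reps(elems, max_len):
    """Duplicate-free tuples over elems: stand-ins for rep values satisfying Inv."""
    return [p for n in range(max_len + 1) for p in permutations(elems, n)]

def eqv(s1, s2):
    """Eqv from the text: same size and same membership."""
    return len(s1) == len(s2) and set(s1) == set(s2)

def same_size(s1, s2):
    """A relation that also relates non-permutations (used for contrast)."""
    return len(s1) == len(s2)

def insert(s, i):
    # Add at the high end only when i is absent, so no duplicates arise.
    return s if i in s else s + (i,)

def has(s, i):
    return i in s

def preserves(proc, rel, result_same, values, elems):
    """True if rel-related rep inputs always give related (or equal) results."""
    return all(result_same(proc(s1, i), proc(s2, i))
               for s1 in values for s2 in values if rel(s1, s2)
               for i in elems)

ELEMS = (1, 2, 3)
VALUES = reps(ELEMS, 3)
equal = lambda a, b: a == b

insert_preserves_eqv = preserves(insert, eqv, eqv, VALUES, ELEMS)     # result is a rep value
has_preserves_eqv = preserves(has, eqv, equal, VALUES, ELEMS)         # result is a Bool
has_preserves_same_size = preserves(has, same_size, equal, VALUES, ELEMS)
```

In this model both procedures preserve Eqv, while the relation `same_size`, which relates sequences that are not permutations, is not preserved by `has`. A finite check of this kind is evidence only; it does not replace the proofs of Figure 5.7.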




Figure 5.7. Proofs that INSERT and HAS Preserve Eqv


For INSERT
assume Eqv(s1, s2), to show that (∀ i) Eqv(INSERT(s1, i), INSERT(s2, i))


Case 1: INDEX(s1, i) ≤ SI$Size(s1) = T
Using Eqv(s1, s2), we have INDEX(s2, i) ≤ SI$Size(s2) = T, so
INSERT(s1, i) = s1, INSERT(s2, i) = s2, so Eqv(INSERT(s1, i), INSERT(s2, i))


Case 2: INDEX(s1, i) ≤ SI$Size(s1) = F
Using Eqv(s1, s2), we have INDEX(s2, i) ≤ SI$Size(s2) = F, so
INSERT(s1, i) = Addh(s1, i), INSERT(s2, i) = Addh(s2, i), so
Eqv(INSERT(s1, i), INSERT(s2, i)) ≡ Eqv(Addh(s1, i), Addh(s2, i)) ≡ T


For HAS


From the semantics of INDEX, we have
(i) INDEX(s, i) > 0 = T,
(ii) INDEX(s, i) ≤ SI$Size(s) ⇒ s[INDEX(s, i)] = i,
(iii) INDEX(s, i) > SI$Size(s) ⇒ [(∀ j) 1 ≤ j ≤ SI$Size(s) ⇒ ~ s[j] = i]


assume Eqv(s1, s2), to show (∀ i) HAS(s1, i) = HAS(s2, i).
HAS(s1, i) = INDEX(s1, i) ≤ SI$Size(s1)


Case 1: INDEX(s1, i) ≤ SI$Size(s1) = T
s1[INDEX(s1, i)] = i
Using Eqv(s1, s2), we get (∃ j) [0 < j ≤ SI$Size(s2) ∧ s2[j] = i], so
INDEX(s2, i) ≤ SI$Size(s2) = T
HAS(s1, i) = HAS(s2, i) = T


Case 2: INDEX(s1, i) ≤ SI$Size(s1) = F
Using Eqv(s1, s2) and the above facts about INDEX, we get
INDEX(s2, i) ≤ SI$Size(s2) = F, so
HAS(s1, i) = HAS(s2, i) = F


5.3.4.2 Restrictions 


For a restriction specifying a required exception condition of σ,
R_i(X) ⇒ σ(X) signals ext_i,
show that whenever P_σ(X) and R_i(X) interpreted in I hold, the procedure OP
implementing σ must signal ext_i. For example, the specification of Set-Int specifies the
following required exception condition for Choose in its restrictions component:
#(s) = 0 ⇒ Choose(s) signals no-element().
So the procedure CHOOSE' must signal no-element() when SIZE(s) = 0
(i.e., SI$Size(s) = 0) holds, which is indeed so (the precondition specified for Choose is T).
For a restriction associating an optional exception condition with σ,
σ(X) signals ext_i ⇒ O_i(X),
show that whenever the procedure OP implementing σ signals ext_i, P_σ(X) and O_i(X)
interpreted in I hold. For example, the specification of Stk-Int given in Figure 3.2 specifies
the following optional exception condition for the operation Push:
Push(s, i) signals overflow(s, i) ⇒ #(s) > 100.
In an implementation of Stk-Int, if the procedure implementing Push signals overflow, then
the size of the input stack must be > 100.
We must also show that if an input to a procedure OP implementing an
operation σ satisfies its precondition, and does not satisfy the condition for any of its required
exceptions or optional exceptions, then the procedure terminates normally. Let
C(X) = (P_σ(X) ∧ (~R_1(X) ∧ ... ∧ ~R_l(X)) ∧ (~O_1(X) ∧ ... ∧ ~O_m(X))),
where for 1 ≤ i ≤ l, R_i is the condition when σ is required to signal ext_i, and for 1 ≤ i ≤ m, O_i is
the condition when σ has the option to signal an exception ext_i. We show that C(X) implies
TC_normal(X), where TC_normal(X) is the weakest input condition for OP to terminate normally.
For every procedure in the implementation of Set-Int, the above condition is
satisfied.
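The three obligations on restrictions (required exceptions, optional exceptions, and normal termination under C(X)) can be illustrated by a small executable check. This Python sketch is not the proof method of the text; it assumes a finite list of test inputs, models ‘signals ex’ by a hypothetical `Signal` exception class, and uses a small `BOUND` as a stand-in for the bound 100 in the Push example. It also assumes a procedure may signal only exceptions whose required or optional condition holds.

```python
class Signal(Exception):
    """Models 'OP(X) signals ex'; args[0] is the exception name."""

def outcome(proc, x):
    """Run proc on the argument tuple x and classify the result."""
    try:
        return ("normal", proc(*x))
    except Signal as e:
        return ("signal", e.args[0])

def satisfies_restrictions(proc, inputs, pre, required, optional):
    """required/optional map exception names to their conditions R_i / O_i."""
    for x in inputs:
        if not pre(*x):
            continue  # inputs outside the precondition are not constrained here
        kind, value = outcome(proc, x)
        must = [ex for ex, cond in required.items() if cond(*x)]
        may = [ex for ex, cond in optional.items() if cond(*x)]
        if must and (kind, value) != ("signal", must[0]):
            return False  # a required exception was not signalled
        if kind == "signal" and value not in must + may:
            return False  # signalled although C(X) held, or without O_i(X)
    return True

BOUND = 2  # hypothetical stand-in for the bound 100 in the text

def make_push(capacity):
    def push(s, i):
        if len(s) >= capacity:
            raise Signal("overflow")
        return s + (i,)
    return push

def choose(s):
    if len(s) == 0:
        raise Signal("no-element")
    return max(s)

always = lambda *args: True
stacks = [((0,) * n, 0) for n in range(6)]
overflow_optional = {"overflow": lambda s, i: len(s) > BOUND}

push_ok = satisfies_restrictions(make_push(4), stacks, always, {}, overflow_optional)
push_early = satisfies_restrictions(make_push(1), stacks, always, {}, overflow_optional)
choose_ok = satisfies_restrictions(
    choose, [((),), ((1,),), ((2, 1),)], always,
    {"no-element": lambda s: len(s) == 0}, {})
```

Here a push with capacity 4 is acceptable because it signals overflow only on stacks larger than `BOUND`, whereas a push with capacity 1 signals too early and is rejected.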
If a nontrivial precondition P_σ is specified for a constructor σ, then a procedure
OP implementing σ either signals on input X not satisfying P_σ, or returns a rep value
which can be constructed by a constructor procedure using an input satisfying its
precondition. For example, a correct implementation of Stk-Int can have the procedure
implementing Pop return a stack when applied on an empty stack. If the procedure
implementing Push signals overflow on a stack of size 128, say, then the procedure
implementing Pop can only return a stack of size ≤ 128. It cannot return a stack of size
1000, say; allowing it to do so would be meaningless.




5.3.4.3 Axioms 


In the derivation of an axiom, we ensure that (i) for every occurrence of a
procedure name OP implementing the operation σ, the input to OP must satisfy the
precondition P_σ associated with σ, and (ii) no subexpression signals any exception.

If an axiom is an equation of the form ‘e_1 = e_2,’ we prove that its interpretation in
I is derivable. If e_1 and e_2 are of type D, = is interpreted as Eqv; otherwise, the
interpretation of e_1 = e_2 in I can be derived using the theories constructed from the
specifications of the rep, the defining types, and internal types.

If an axiom is of the form ‘e_1 = if b then e_2,’ we have to prove that ‘b ⇒ e_1 = e_2’
when interpreted in I is derivable. Similarly, for an axiom ‘e_1 = if b then e_2 else e_3,’ we must
prove that ‘b ⇒ e_1 = e_2’ and ‘~b ⇒ e_1 = e_3’ are derivable in I. Recall that the condition b is
assumed to behave deterministically even when it involves nondeterministic operation
symbols. Figure 5.8 is a proof that the then part of the axiom,

Remove(Insert(s, i1), i2) = if i1 = i2 then Remove(s, i2) else Insert(Remove(s, i2), i1),
is derivable. The derivation of the else clause,
(~ i1 = i2) ⇒ Remove(Insert(s, i1), i2) = Insert(Remove(s, i2), i1),
uses a property of the representing values that


(∀ i) [ rep$Size(s) > 0 ∧ IN(s, i) ⇒ (∃! j) [ 1 ≤ j ≤ rep$Size(s) ∧ s[j] = i ] ],


Figure 5.8. Proof that an Axiom of Set-Int is Derivable


i1 = i2 ⇒ Remove(Insert(s, i1), i2) = Remove(s, i2)
Assume i1 = i2, to show Eqv(REMOVE(INSERT(s, i1), i2), REMOVE(s, i2))


Case 1: INDEX(s, i1) ≤ rep$Size(s) = T
INSERT(s, i1) = s, so the above holds.


Case 2: INDEX(s, i1) ≤ rep$Size(s) = F
Let r = INSERT(s, i1) = Addh(s, i1)
Using i1 = i2, INDEX(r, i2) = rep$Size(Addh(s, i1)), so
REMOVE(r, i2) = s, and
REMOVE(s, i2) = s, so the above holds.




which is preserved by the constructor procedures.⁵

The axiom ‘Choose(s) ∈ s = T’ under the condition ‘~ Size(s) = 0,’ when
interpreted in I, is ‘HAS(CHOOSE'(s), s) = T.’ This is derivable, because
‘INDEX(MAX(s), s) ≤ rep$Size(s) = T’ is derivable. The remaining axioms in the
specification of Set-Int can also be shown to be derivable.

The above five steps constitute the correctness method. If an implementation I
can go through the above steps, it is correct with respect to S. For example, the
implementation of Set-Int given in Figure 5.1 with CHOOSE replaced by CHOOSE' goes
through the above steps, and is thus correct.
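The role of Eqv as the interpretation of = on values of type D can be seen on the Remove/Insert axiom. The Python sketch below is an illustration under assumptions: tuples without duplicates stand in for rep values, and `remove` is a hypothetical swap-remove (it fills the hole with the last element), chosen so that the two sides of the axiom can differ as sequences while remaining permutations of each other. It is not the REMOVE of Figure 5.1.

```python
from itertools import permutations

def eqv(s1, s2):
    """Eqv from the text: same size and same membership."""
    return len(s1) == len(s2) and set(s1) == set(s2)

def insert(s, i):
    return s if i in s else s + (i,)

def remove(s, i):
    # Hypothetical swap-remove: move the last element into the hole.
    if i not in s:
        return s
    k = s.index(i)
    last, body = s[-1], s[:-1]
    return body if k == len(s) - 1 else body[:k] + (last,) + body[k + 1:]

def axiom_holds(s, i1, i2, same):
    """Remove(Insert(s,i1),i2) = if i1 = i2 then Remove(s,i2) else Insert(Remove(s,i2),i1)."""
    lhs = remove(insert(s, i1), i2)
    rhs = remove(s, i2) if i1 == i2 else insert(remove(s, i2), i1)
    return same(lhs, rhs)

ELEMS = (1, 2, 3)
VALUES = [p for n in range(3) for p in permutations(ELEMS, n)]  # sizes 0..2
CASES = [(s, i1, i2) for s in VALUES for i1 in ELEMS for i2 in ELEMS]

holds_up_to_eqv = all(axiom_holds(s, i1, i2, eqv) for s, i1, i2 in CASES)
holds_up_to_identity = all(axiom_holds(s, i1, i2, lambda a, b: a == b)
                           for s, i1, i2 in CASES)
```

With this `remove`, the axiom holds on every case when = is read as Eqv, but not when it is read as identity of sequences; for instance s = (1, 2), i1 = 3, i2 = 1 gives (3, 2) on the left and (2, 3) on the right.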


5.3.5 Nondeterministic Procedures 


We now consider the case when an implementation has a nondeterministic
procedure implementing an operation specified to be nondeterministic by S. We have
already discussed the conditions for a nondeterministic procedure to preserve Inv and the
equivalence relation Eqv. Various steps in the correctness proof discussed above remain
the same except that if an axiom involves the nondeterministic procedure, we must use the
interpretation of formulas involving nondeterministic function symbols discussed in
Chapter 4. In addition, it must be ensured that for any input, the nondeterministic
procedure does not have a choice of signalling as well as terminating normally.

For example, if we consider the implementation of Set-Int in Figure 5.1 with
CHOOSE replaced by CHOOSE'', most of the above proof remains valid. We have to
show that the axiom ‘Choose(s) ∈ s = T’ is derivable under the condition ‘~ Size(s) = 0.’
That is, if ‘rep$Size(s) > 0’ holds, then

HAS(s, CHOOSE''(s)) = T          (*)
is derivable. CHOOSE''(s) can either return MAX(s) or MIN(s). For both possibilities, (*)
is derivable, as

INDEX(MAX(s), s) ≤ rep$Size(s) = T


5. (∃! j) stands for ‘there exists a unique j such that.’




is derivable from the specifications of MAX and INDEX, and
INDEX(MIN(s), s) ≤ rep$Size(s) = T
is derivable from the specifications of MIN and INDEX. Note that CHOOSE'' preserves
the equivalence relation Eqv.
The implementation of Set-Int in Figure 5.1 with CHOOSE replaced by
CHOOSE'' is also correct.
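The conditions on a nondeterministic procedure can be made concrete by modelling the procedure as a function that returns the set of its possible outcomes. This Python sketch is an assumption-laden illustration: `choose_max_or_min` is a stand-in for a CHOOSE''-like procedure, outcomes are tagged as "normal" or "signal", and the three checks are run only over a small finite set of duplicate-free tuples.

```python
from itertools import permutations

def eqv(s1, s2):
    """Eqv from the text: same size and same membership."""
    return len(s1) == len(s2) and set(s1) == set(s2)

def choose_max_or_min(s):
    """Set of possible outcomes of a CHOOSE''-like procedure on s."""
    if len(s) == 0:
        return {("signal", "no-element")}
    return {("normal", max(s)), ("normal", min(s))}

def no_mixed_outcomes(proc, values):
    """No input offers a choice between signalling and terminating normally."""
    return all(len({kind for kind, _ in proc(s)}) == 1 for s in values)

def axiom_choose_in_s(proc, values):
    """Choose(s) in s holds for every possible result on a nonempty s."""
    return all(kind == "normal" and v in s
               for s in values if len(s) > 0
               for kind, v in proc(s))

def preserves_eqv(proc, values):
    """Equivalent inputs admit the same possible results (results are Ints)."""
    return all(proc(s1) == proc(s2)
               for s1 in values for s2 in values if eqv(s1, s2))

VALUES = [p for n in range(4) for p in permutations((1, 2, 3), n)]

unmixed = no_mixed_outcomes(choose_max_or_min, VALUES)
axiom_ok = axiom_choose_in_s(choose_max_or_min, VALUES)
eqv_ok = preserves_eqv(choose_max_or_min, VALUES)
```

All three checks succeed in this model, which matches the claim that CHOOSE'' preserves Eqv: the maximum and minimum of a sequence do not depend on the order of its elements.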


5.3.6 Pseudo-Nondeterministic Procedures 


A pseudo-nondeterministic procedure (which could be either deterministic or
nondeterministic) is not required to preserve the equivalence relation Eqv.⁶ The
correctness proof in this case is also carried out as above, depending on whether the procedure
is deterministic or nondeterministic. However, we must ensure that if the procedure
terminates normally for any input X, then it must do so for all inputs equivalent to X, and if
it signals on an input X, then it must signal equivalent exceptions for all inputs equivalent to
X. This ensures that a pseudo-nondeterministic procedure does not have a choice of
signalling as well as terminating normally on equivalent rep values.

We now take the implementation of Set-Int in Figure 5.1. CHOOSE is
deterministic; it returns the bottom element of the nonempty sequence. Eqv is not
preserved by CHOOSE. If the axiom ‘Choose(s) ∈ s = T’ is derivable under the condition
that ‘Size(s) ≠ 0,’ then this implementation is also correct. The proof of the axiom is
straightforward: If ‘rep$Size(s) > 0’ holds, then

HAS(s, CHOOSE(s)) = T = HAS(s, Bottom(s)) = T 

When an implementation does not have any pseudo-nondeterministic procedures, 
then the interpretation of = in I is the largest equivalence relation preserved by the 
procedures. However, a weaker equivalence relation preserved by the procedures may
suffice to show that the restrictions and axioms of S hold in I.
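The situation of a pseudo-nondeterministic procedure can be illustrated on a CHOOSE that returns the bottom element. In the Python sketch below, `choose_bottom` is a simplified stand-in for CHOOSE, `Signal` is a hypothetical class modelling a signalled exception, and the checks range over a small finite set of duplicate-free tuples only.

```python
from itertools import permutations

class Signal(Exception):
    """Models a signalled exception; args[0] is the exception name."""

def eqv(s1, s2):
    """Eqv from the text: same size and same membership."""
    return len(s1) == len(s2) and set(s1) == set(s2)

def choose_bottom(s):
    """Deterministic CHOOSE: returns the bottom (first) element."""
    if len(s) == 0:
        raise Signal("no-element")
    return s[0]

def outcome(s):
    try:
        return ("normal", choose_bottom(s))
    except Signal as e:
        return ("signal", e.args[0])

VALUES = [p for n in range(4) for p in permutations((1, 2, 3), n)]
PAIRS = [(s1, s2) for s1 in VALUES for s2 in VALUES if eqv(s1, s2)]

# Eqv is not preserved: (1, 2) and (2, 1) are equivalent but give 1 and 2.
preserves_eqv = all(outcome(s1) == outcome(s2) for s1, s2 in PAIRS)

# Equivalent inputs still agree on whether the procedure signals or returns.
same_kind = all(outcome(s1)[0] == outcome(s2)[0] for s1, s2 in PAIRS)

# The axiom Choose(s) in s holds on every nonempty rep value.
axiom_ok = all(choose_bottom(s) in s for s in VALUES if len(s) > 0)
```

The procedure fails the preservation check yet satisfies the axiom and never mixes signalling with normal termination on equivalent inputs, which is the combination the text requires of a pseudo-nondeterministic procedure.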


6. For example, a procedure CHOOSE''' which nondeterministically picks between the top (last) and the
bottom (first) element of the sequence is nondeterministic and does not preserve the equivalence relation Eqv.
So, CHOOSE''' is also pseudo-nondeterministic.




Though the designer of an implementation usually has an idea of what the
observable equivalence relation is, sometimes it may not be known. In that case, we will
not know what procedures are pseudo-nondeterministic. Then, we choose an equivalence
relation preserved by the procedures implementing the deterministic operations, and using
it as the interpretation of =, we attempt to show that every axiom as interpreted in I is
derivable. If successful, the implementation I is correct; otherwise, a stronger equivalence
relation is chosen and the above process is repeated. If the correctness of I cannot be
established even when the strongest equivalence relation preserved by the procedures
implementing the deterministic operations is chosen, then I is incorrect.

Another way to view the above correctness method is to consider the specification
of the procedures in an implementation I as axioms of the theory of I, defining the
functions computed by the procedures, and show that every nonlogical axiom of Th(S) is in
the theory of I. The theory of I also includes the theories of the types used in I. Nakajima
et al. [62] take a similar view.




5.4 Recursive and Mutually Recursive Implementations


Def. 5.3. An implementation I of D depends on a data type D' if and only if
(i) D' is used in I, or
(ii) a data type D'' used in I depends on D'.


In Def. 5.3 above, it is assumed that data types other than D are abstractly used in
an implementation I of D. In the correctness method discussed in the previous two
sections, we have assumed that

(i) an implementation I of D does not depend on D, and

(ii) an implementation of a data type D' used in I does not depend on D.

We relax these constraints. We call an implementation I of D recursive if and only if the
rep used in I depends on D. We call an implementation I of D and another
implementation I' of D' mutually recursive if and only if I depends on D' and I' depends on
D. We assume that recursion is not due to internal types used in I. It should be noted that
if the implementations of a set of data types are mutually recursive, that does not mean that
the data types are also mutually recursive (mutually recursive data types are discussed in
Section 2.4). We first discuss how the method proposed in Section 5.3 can be modified to deal
with recursive implementations; later we consider mutually recursive implementations.

5.4.1 Recursive Implementations 


Figure 5.9. An Uninteresting Recursive Implementation of D


D = cluster is OP_1, OP_2, ...
    rep = D
    OP_1 = proc(...) returns ...
        return (D$OP_1(...))
    end OP_1


In proving correctness of a recursive implementation, we consider a reference to
D in I as a reference to its rep and an invocation of an operation σ of D as a call to the
procedure OP implementing σ. The equate defining the rep inside I is considered as a
recursive domain equation, as the construction of the rep depends on D itself. For


Figure 5.10. Implementation of List-Int


LIST-INT = cluster is NIL, CONS, CAR, CDR, IS_IN, IS_EMPTY


    rep = oneof [ empty: Null, pair: Pair ]
    Pair = struct [ car: Int, cdr: List-Int ]


    NIL = proc() returns (cvt)
        return (rep$make_empty(nil))
    end NIL


    CONS = proc(i: Int, l: List-Int) returns (cvt)
        return (rep$make_pair(Pair${car: i, cdr: l}))
    end CONS


    CAR = proc(l: cvt) returns (Int) signals (empty)
        tagcase l
            tag pair (p: Pair): return (p.car)
            tag empty: signal empty()
        end
    end CAR


    CDR = proc(l: cvt) returns (List-Int) signals (empty)
        tagcase l
            tag pair (p: Pair): return (p.cdr)
            tag empty: signal empty()
        end
    end CDR


    IS_IN = proc(i: Int, l: cvt) returns (Bool)
        tagcase l
            tag pair (p: Pair): if p.car = i then return (true)
                                else return (List-Int$is_in(i, p.cdr)) end
            tag empty: return (false)
        end
    end IS_IN


    IS_EMPTY = proc(l: cvt) returns (Bool)
        return (rep$is_empty(l))
    end IS_EMPTY




example, consider the implementation of a data type list of integers, denoted by List-Int, 
given in Figure 5.10; its rep is a recursive domain equation. A recursive domain equation 
can be solved by defining an ordering on type algebras and using Kleene’s Recursion 
Theorem. The rep is the least fixed point solution of the equation (see [..] for details about
such an ordering).

For a correct implementation I, the type algebra of the rep should have a
nonempty principal domain. This property is trivially ensured if rep is nonrecursive. For
some recursive implementations such as the one given in Figure 5.9, the least fixed point is
the empty algebra, an algebra having no domain and no functions. For well founded rep
equates such as in case of List-Int, the algebras are nonempty. If the rep can be proved to
be nonempty, the method proposed in the previous section can be used. The proof that the
least fixed point of a domain equation defining the rep is nonempty is the only additional
step in proving the correctness of a recursive implementation.
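The difference between the two rep equations can be seen by iterating each equation from the empty domain, in the spirit of the Kleene construction mentioned above. The Python sketch below is a finite illustration under assumptions: domains are modelled as finite sets of values only (the operations of the type algebra are ignored), `INTS` is a two-element stand-in for Int, and the function names are hypothetical.

```python
def approximations(step, n):
    """Kleene-style chain: empty set, step(empty), step(step(empty)), ..."""
    chain = [frozenset()]
    for _ in range(n):
        chain.append(frozenset(step(chain[-1])))
    return chain

INTS = (0, 1)  # small stand-in for the domain of Int

def step_trivial(d):
    # rep = D, as in Figure 5.9: the equation adds no new values.
    return d

def step_list(d):
    # rep = oneof[empty: Null, pair: struct[car: Int, cdr: List-Int]]
    return {("empty",)} | {("pair", i, l) for i in INTS for l in d}

trivial_sizes = [len(a) for a in approximations(step_trivial, 3)]
list_sizes = [len(a) for a in approximations(step_list, 3)]
```

The chain for `rep = D` never leaves the empty set, while the chain for the List-Int rep is nonempty from the first step on and keeps growing. The actual least fixed point for List-Int is infinite; the sketch only shows its first few approximations.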

Figure 5.11 has specifications of the procedures in the implementation of List-Int.
(The specifications of Null, Struct [n_1: D_1, ..., n_k: D_k], and One-of [n_1: D_1, ..., n_k: D_k] are
given in Appendix IV.) Figure 5.12 is a specification of List-Int. We give below a sketch
of various steps in the correctness proof of the implementation of List-Int given in
Figure 5.10.


Figure 5.11. Translation of the Procedures of List-Int

rep = oneof [ empty: Null, pair: Pair ]
Pair = struct [ car: Int, cdr: List-Int ]

NIL() ≜ rep$make_empty(nil)

CONS(i, l) ≜ rep$make_pair(Pair${car: i, cdr: l})

CAR(l) ≜ rep$is_pair(l) ⇒ Pair$get_car(rep$value_pair(l))
         ~ rep$is_pair(l) ⇒ signal(empty)

CDR(l) ≜ rep$is_pair(l) ⇒ Pair$get_cdr(rep$value_pair(l))
         ~ rep$is_pair(l) ⇒ signal(empty)

IS_IN(i, l) ≜ rep$is_pair(l) ⇒ (i = Pair$get_car(rep$value_pair(l)) ∨
                                 IS_IN(i, Pair$get_cdr(rep$value_pair(l))))
              ~ rep$is_pair(l) ⇒ false

IS_EMPTY(l) ≜ rep$is_empty(l)


Figure 5.12. Specification of List-Int


Operations
    Nil      :                 → List-Int
    Cons     : Int × List-Int  → List-Int
    Car      : List-Int        → Int
                               → empty()
    Cdr      : List-Int        → List-Int
                               → empty()
    Is-In    : Int × List-Int  → Bool
    Is-Empty : List-Int        → Bool

Restrictions
    Is-Empty(l) ⇒ Car(l) signals empty()
    Is-Empty(l) ⇒ Cdr(l) signals empty()

Axioms
    Car(Cons(i, l)) = i
    Cdr(Cons(i, l)) = l
    Is-In(i, Nil) = F
    Is-In(i1, Cons(i2, l)) = if i1 = i2 then T else Is-In(i1, l)
    Is-Empty(Nil) = T
    Is-Empty(Cons(i, l)) = F


(i) The least fixed point of the recursive domain equation is nonempty. For any model of
Int, the approximations to the rep are nonempty.

(ii) Inv(s) is T.

(iii) The termination of procedures other than IS_IN is obvious, assuming that the
tagcase and the operations of one-of terminate. For IS_IN, we can prove termination
using McCarthy and Cartwright's approach, or by using the fact that the rep is well
founded with respect to the ordering l < one-of[pair: [car: i, cdr: l]] for any i and l.

(iv) The equivalence relation on the rep is the identity relation =.

(v) The procedures return normally on an input for which the restriction component does
not specify the corresponding exception.

(vi) Every restriction is derivable.

(vii) Every axiom is derivable.
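Steps (v) to (vii) can be exercised on a direct transliteration of the List-Int procedures. The Python sketch below is not the CLU cluster of Figure 5.10; it assumes tagged tuples as the rep, a hypothetical `Empty` class for the exception empty(), and a finite sample of lists over a two-element stand-in for Int. Since the equivalence relation on the rep is the identity, the axioms are checked with `==`.

```python
class Empty(Exception):
    """Models the exception empty()."""

def nil():
    return ("empty",)

def cons(i, l):
    return ("pair", i, l)

def car(l):
    if l[0] == "pair":
        return l[1]
    raise Empty()

def cdr(l):
    if l[0] == "pair":
        return l[2]
    raise Empty()

def is_in(i, l):
    if l[0] == "pair":
        return l[1] == i or is_in(i, l[2])
    return False

def is_empty(l):
    return l[0] == "empty"

def signals_empty(proc, l):
    try:
        proc(l)
    except Empty:
        return True
    return False

def lists(ints, depth):
    """All lists over ints with at most `depth` elements."""
    level = [nil()]
    out = list(level)
    for _ in range(depth):
        level = [cons(i, l) for i in ints for l in level]
        out += level
    return out

INTS = (0, 1)
LISTS = lists(INTS, 3)

restrictions_ok = all(signals_empty(car, l) and signals_empty(cdr, l)
                      for l in LISTS if is_empty(l))
normal_ok = all(not signals_empty(car, l) and not signals_empty(cdr, l)
                for l in LISTS if not is_empty(l))
axioms_ok = (
    is_empty(nil())
    and not any(is_in(i, nil()) for i in INTS)
    and all(car(cons(i, l)) == i and cdr(cons(i, l)) == l
            and not is_empty(cons(i, l))
            for i in INTS for l in LISTS)
    and all(is_in(i1, cons(i2, l)) == (True if i1 == i2 else is_in(i1, l))
            for i1 in INTS for i2 in INTS for l in LISTS)
)
```

Such a check covers only the sampled lists; the termination argument for IS_IN and the nonemptiness of the rep in steps (i) and (iii) still have to be established separately.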




5.4.2 Mutually Recursive Implementations 


We prove the correctness of mutually recursive implementations in a way similar
to the case of a recursive implementation. The correctness of mutually recursive
implementations must be proved together. The reps of the two implementations are
specified as mutually recursive domain equations; the solutions of these equations are the
least fixed points, which serve as the rep of D and the rep of D'. For the implementations I
and I' to be correct, both reps must be nonempty. The rest of the proof is the same as in case of
nonrecursive implementations with the exception that the correctness proof for all mutually
recursive implementations is done together. The implementations I and I' have to be
shown to satisfy the restrictions and axioms in S and S'. The invocation of an operation of
D' in I is considered as a call to the procedure in I' implementing the operation, and the
invocation of an operation of D in I' is considered as a call to the procedure in I
implementing the operation.

The correctness proof cannot be hierarchically structured in case of mutually
recursive implementations, because their correctness has to be proved together. For this
reason, we do not recommend that hierarchically structured (nonrecursive) data types be
implemented mutually recursively. However, for a set of mutually recursive data types,
their implementations have to be proved correct together, so these data types can be
implemented mutually recursively without adding to the complexity of the correctness
proof.




6. Conclusions 


We have presented a rigorous framework for abstract data types, and studied four 
important aspects of abstract data types, namely definition, specification, theory, and 
implementation correctness, within this framework. An overview of the approach taken in
studying these issues is given in Chapter 1. The framework has provided a base from
which to ask many interesting and important questions about data types. Some of these
questions have been answered in the thesis, while others suggest directions for further
research. Below, we first summarize the contributions of our work and then indicate areas
where further work is required.


6.1 Summary of Contributions 


We have made a clear distinction between a data type and its specification(s) in 
our research. The behavioral approach for defining a data type, developed in the thesis,
embodies the view of a data type taken in programming languages. It considers only the
input-output behavior of the operations. It abstracts from the representational structure of
the values and the operations of a data type as well as from multiple representations of
values for a particular representational structure. Our definitional method can handle data
types with nondeterministic operations and with operations exhibiting exceptional
behavior. It is independent of specification methods for data types. Specification
languages other than the one proposed in the thesis can also be developed based on it. It
can be used to give the semantics of existing specification languages. In [43], we have
studied and compared the expressive power of various specification languages for data
types. Using the definitional method, we have been able to characterize computability over
the values of a data type, and study the expressive power of the operation set of different
designs of a data type [42].

The specification language proposed in the thesis is structured and flexible. The 
normal behavior and the exceptional behavior of the operations are specified separately. 
The language provides mechanisms to specify (i) nondeterministic operations, (ii)
preconditions for operations stating what portion of the input domain of an operation is
interesting, (iii) exceptions which must be signalled by the operations, and (iv) exceptions
which the operations can optionally signal. In designing the specification language, one of
the goals has been to facilitate writing specifications as well as proving properties of data
types from their specifications without having to express the properties that can be
deduced. The semantics of a specification is given as a set of data types. Equivalence
among specifications is defined.

We have proposed a deductive system for abstract data types and studied its
different components. A first order theory of a data type is defined, which is constructed
from its specification using the deductive system. The well definedness, sufficient
completeness and completeness properties of a specification are defined based on what can
be deduced from it. These properties are related to the model theoretic properties of a
specification. A clear distinction is made between the model theoretic and proof theoretic
properties of a specification.

We propose a correctness criterion for an implementation of a data type with 
respect to its specification, independent of implementation correctness methods and
specification methods. Many implementation correctness methods can be developed
embodying this criterion. We develop a correctness method which is simple and natural
for a wide class of specifications.

Throughout this research, we have emphasized modularity and hierarchical
structure, be it the definition, specification, deductive system, or implementation of a data
type.

The development of the framework has also provided useful insights into data
type behavior and programming language features, such as the advantage of having a
protected encapsulation mechanism for implementing a data type, separation of the
exception handlers from the type behavior, significance of hierarchical structure and
modularity, etc.




6.2 Directions for Further Research 


We first discuss topics of further research emerging from the discussion in various
chapters. We later discuss other aspects of data type behavior not studied in the thesis, and
finally, the topics in which the assumptions made about data type behavior in the thesis are
relaxed.

We have not investigated how easily the deductive system proposed in Chapter 4 
can be automated or incorporated into an existing automatic data type deduction system 
such as AFFIRM. We do not anticipate any major problems in incorporating the 
subsystem for reasoning about the exceptional behavior of a data type, because the axioms 
describing the exceptional behavior are similar to equations and can be transformed to 
rewrite rules. However, the subsystem for reasoning about nondeterministic operations 
involves axioms using existential quantifiers. A verification system based on first order 
predicate calculus can in principle incorporate this subsystem. We feel that the full power 
of first order predicate calculus with its complexity is not required. An approach for
untransformed axioms (in which properties are expressed using nondeterministic symbols)
similar to rewrite rules for equational axioms needs to be investigated.

The implementation correctness method discussed in Chapter 5 uses an
equivalence relation on the values of the rep (representing type) and requires that the
implementation be extended to include the definitions of auxiliary functions used in a
specification, if any. It would be useful to develop a method that can derive this
information from the specification and the implementation. We do not anticipate any
problems in automating the remaining steps of the method; however, the interface between
a verification system embodying proof rules for control structures and a data type
deduction system may need to be analyzed. We are investigating another method that does
not require the equivalence relation and the definitions of auxiliary functions for an
implementation. It is based on the behavioral equivalence relation on models: For every 
computation having an observer as its outermost operation, if the specification prescribes a 
result, a value returned by the computation when interpreted in the implementation must 
be one of the possible results prescribed by the specification. 

The proposed implementation correctness method tells whether an
implementation is correct with respect to a specification. It would be interesting to extend
it so that the bug(s) in an incorrect implementation can be located; this would help in
debugging the implementation.

Another complementary area for further study is that of systematic testing for
enhancing confidence in a piece of software. In addition to using it for testing programs
using the data type, a specification of a data type can be used to design a set of test cases for
checking the implementations of the data type. Gannon et al. [19] discuss a system in
which a specification of a data type as a set of conditional equations is presented along with
a set of test cases which can be executed using the implementation to test for the
consistency of the implementation with the specification. A methodology for designing an
‘adequate’ set of test cases from a specification would be very useful for such a system.

Specifications are often hard to write, and especially the writing of an ‘algebraic’
specification has been found to be hard [41, 3]. We are investigating a method for writing a
specification in a systematic manner; using this method, we have been able to write
specifications of data types such as traversable stack [4?], file [42], etc. A system that
embodies such a method and helps a designer in writing a specification would be very
useful. It should assist the designer in analyzing a specification so as to enhance his
confidence in the specification. It should check for general structural properties of a
specification such as well definedness and completeness, which ensure proper relations
among different components of the specification. The undecidability of completeness and
well definedness can be shown by reducing them to the Post Correspondence problem [58]
in Post systems. However, sufficient conditions on axioms and restrictions which guarantee
well definedness and completeness of a specification need to be investigated. The results of
Guttag and Horning [..] and Fait [67] will probably be useful in arriving at these
conditions.

It is equally important to ensure that a specification captures the intent of
the designer. This can be checked in several ways, some of which are complementary. The
designer can express additional properties that a data type should satisfy. He then attempts
to prove these properties from its specification using the deductive system. Another
approach is for the designer to come up with a model of the data type and then check that
the axioms and restrictions hold in that model. A third approach can be similar to program
testing; the specification can be validated on a set of test cases.

Guttag and Horning [32] have suggested how formal specifications can be used as 
a tool for designing software. Our specification language can be used to aid the design of
the data component of software. For it to be used for writing specifications of general
software, it must be extended to include mechanisms for specifying mutable behavior,
procedural abstractions, other control abstractions, etc.

An important aspect of data types not studied in our framework is the 
relationships among different data types. One important relationship is among the set of
data types defined by a type scheme (also called a parameterized type). Data types in the
set defined by a type scheme have similar behavior except that the values of these data
types may have their constituents belonging to different types, and the values may have
different structural constraints, for example, different upper bounds on the size of the
values, etc. This variation in the behavior of different types is expressed using two kinds of
parameters: constant parameters ranging over the values of a data type, often used to
express the structural constraints on the values, such as bounds on the size of the values,
and type parameters stating the type of the constituents of the values. For example, a type
scheme Stk[n : Int, t : Types] defines a set of data types that have the behavior of stacks,
and that differ in the type of the elements of stacks and the upper bound on the size of
stacks. Types stands for the set of all data types, and is itself not a data type. The data type
Stk-Int-100, for example, is an instance of the type scheme Stk with n = 100 and
t = Int.

A type scheme is in fact a partial function from the cartesian product of the
domains of its parameters to the set of all types, Types. For a particular set of parameters,
this function either returns a data type or is undefined. For example, the type scheme Stk ...
However, if parameters of a type scheme are required to satisfy certain properties, then the
function returns a data type only if the parameters satisfy the desired properties. For
example, in case of the type scheme Set[t : Types], its type parameter must have an equal
operation with standard properties.




The specification language proposed in Chapter 3 can be easily extended to
specify type schemes. A specification should have an additional component, called
Requires, stating conditions on the parameters ranging over types. The Requires
component can specify both the operations that the type parameter must have and their
properties. The semantics of such a specification can be easily given. How the deductive
system proposed in Chapter 4 can be extended to type schemes would need further
investigation.
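The view of a type scheme as a partial function guarded by a Requires component can be sketched in a few lines. This Python illustration is hypothetical throughout: `set_scheme` stands for a scheme like Set[t : Types], the requirement is reduced to the mere presence of an `equal` operation on the parameter (its ‘standard properties’ are not checked), and a data type is represented simply as a dictionary of operations over tuples.

```python
def set_scheme(t):
    """Hypothetical scheme Set[t]: defined only when t offers an `equal` operation."""
    if not callable(getattr(t, "equal", None)):
        return None  # the partial function is undefined for this parameter

    def has(s, x):
        return any(t.equal(x, y) for y in s)

    def insert(s, x):
        return s if has(s, x) else s + (x,)

    return {"empty": (), "insert": insert, "has": has}

class CaseInsensitive:
    """A parameter type whose equal ignores letter case."""
    @staticmethod
    def equal(a, b):
        return a.lower() == b.lower()

class NoEqual:
    """A parameter type that does not meet the requirement."""

StrSet = set_scheme(CaseInsensitive)
s = StrSet["insert"](StrSet["insert"](StrSet["empty"], "a"), "A")
undefined = set_scheme(NoEqual)
```

The scheme yields a set type for `CaseInsensitive`, whose `equal` makes "a" and "A" the same element, and yields nothing for `NoEqual`. Verifying the properties of `equal` (reflexivity, symmetry, transitivity) would be the job of the deductive system, not of such a runtime check.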

Apart from a type scheme, there are other interesting relations among different
data types. There are standard mathematical relations, such as the relation between a 
cartesian product of data types and its components; the relation between discriminated 
unions and its components; etc. Some of these relations can be expressed as type schema. 
The notion of a subtype of a type needs investigation. For example, what relations exist 
between integers, rationals, and algebraic reals? How do sets, multisets, ordered sets, and 
sequences relate, and how do stacks and traversable stacks relate? 

Our framework is limited in three respects. Firstly, the definition of a data type 
only incorporates the input-output behavior of its operations. It does not consider another 
aspect of the operations, namely how efficiently these operations can be performed. It is 
not even clear whether the computational complexity of the operations should be included 
in a definition of a data type, or whether it is an orthogonal constraint on the 
implementations that should be included in a specification. We think that the input-output 
behavior of the operations of a data type should be kept separate from their computational 
complexity, but a specification should have another component stating the performance 
requirements on the implementations of the operations. 

Secondly, we have assumed a simple model of nondeterminism in analyzing the 
input-output behavior of the operations. For an input on which a nondeterministic 
operation can return many possible results, we have not considered how these results are 
scheduled. It would be interesting to incorporate the scheduling information and extend 
the definitions of observable behavior and distinguishability of values. It would also be 
interesting to investigate how our formalism is affected if we relax the assumption that a
nondeterministic operation cannot have the choice of signalling as well as terminating
normally on a particular input.

Thirdly, the definitional method handles only immutable data types. As is
discussed in Appendix I, for a wide class of mutable data types, the states of their objects
can be modeled as the values of an immutable data type. However, the framework needs to
be extended to handle arbitrary mutable data types including data types having objects
whose state is also mutable, for example, the data type list in MACLISP. The specification
language and a deductive system based on the extended framework need to be developed.
Berzins's work [3] can be useful in studying this extension.




References


1. Preliminary ADA Reference Manual and Rationale. SIGPLAN Notices
Vol. 14 No. 6, June, 1979.

2. Berzins, V. Personal Communication. Lab. for Computer Science, MIT,
Dec., 1976.

3. Berzins, V. Abstract Model Specification for Data Abstractions.
LCS-TR-221, Lab. for Computer Science, MIT, MA, 1979.

4. Birkhoff, G., Lipson, J.D. Heterogeneous Algebras. Journal of
Combinatorial Theory Vol. 8, 1970, pp. 115-133.

5. Brand, D., Daringer, J.A., Joyner, W.H. Completeness of Conditional
Reductions. IBM Research Report RC7404, Yorktown Heights, New York,
Dec., 1978.

6. Burstall, R.M. Proving Properties of Programs by Structural Induction.
Computer Journal Vol. 12, Feb., 1969, pp. 41-48.

7. Burstall, R.M., Goguen, J.A. Putting Theories Together To Make
Specifications. Invited Paper at the Fifth International Joint Conf. on Artificial
Intelligence, Cambridge, MA, Aug., 1977.

8. Cartwright, R., and McCarthy, J. Recursive Programs as Functions in a First
Order Theory. Report No. STAN-CS-79-711, Stanford University,
March, 1979.

9. Cohen, J. Non-Deterministic Algorithms. Computing Surveys Vol. 11 No. 2,
June, 1979, pp. 79-94.

10. Cohn, P.M. Universal Algebra. Harper and Row, New York, 1965.

11. Dahl, O.-J., Nygaard, K., Myhrhaug, B. The Simula 67 Common Base
Language. Norwegian Computing Center, Forskningsveien 1B, Oslo, 1968.

12. Dijkstra, E.W. Notes on Structured Programming. In Structured
Programming (Dahl, O.-J., Dijkstra, E.W., Hoare, C.A.R.), Academic Press,
London and New York, 1972, pp. 1-81.


13. Dijkstra, E.W. A Discipline of Programming. Prentice Hall, Englewood
Cliffs, NJ, 1976.

14. Dershowitz, N., and Manna, Z. Proving Termination with Multiset
Ordering. Comm. ACM Vol. 22 No. 8, Aug., 1979, pp. 465-476.

15. Ehrig, H., Kreowski, H., Padawitz, P. Stepwise Specification and
Implementation of Abstract Data Types. Proceedings of the Fifth International
Colloq. on Automata, Languages, and Programming, Udine, as Lecture Notes in
Computer Science Vol. 62, Springer-Verlag, 1978, pp. 205-206.

16. Enderton, H.B. A Mathematical Introduction to Logic. Academic Press,
New York and London, 1972.

17. Floyd, R.W. Assigning Meanings to Programs. Proceedings of
Symposium in Applied Math, Vol. 19, Mathematical Aspects of Computer
Science (ed. Schwartz, J.T.), American Mathematical Society, Providence, R.I.,
1967, pp. 19-32.

18. Friedman, D.P., Wise, D.S. CONS should not Evaluate its Arguments.
Technical Report No. 44, Computer Science Dept., Indiana University,
Nov., 1975.

19. Gannon, J., McMullin, P., Hamlet, R., Ardis, M. Testing Traversable
Stack. SIGPLAN Notices Vol. 15 No. 1, Jan., 1980, pp. 58-65.

20. Goguen, J.A. Abstract Errors for Abstract Data Types. Proceedings of the
IFIP Working Conference on Formal Basis of Programming Concepts Vol. 2,
Aug., 1977, pp. 21.1-21.32.

21. Goguen, J.A., and Tardo, J.J. An Introduction to OBJ: A Language for
Writing and Testing Formal Algebraic Program Specifications. Proceedings,
IEEE Conf. on Specifications of Reliable Software, Cambridge, MA,
April, 1979, pp. 170-189.

22. Goguen, J.A., Thatcher, J.W., Wagner, E.G., Wright, J.B. Abstract Data
Types as Initial Algebras and Correctness of Data Representations.
Proceedings, Conference on Computer Graphics, Pattern Recognition and Data
Structure, May, 1975, pp. 89-93.


23. Goguen, J.A., Thatcher, J.W., Wagner, E.G. Initial Algebra Approach to
the Specification, Correctness, and Implementation of Abstract Data Types. In
Current Trends in Programming Methodology, Vol. IV, Data Structuring, (ed.
Yeh, R.T.), Prentice Hall, Englewood Cliffs, NJ, 1978.

24. Goodenough, J.B. Exception Handling: Issues and A Proposed Notation.
Comm. ACM Vol. 18 No. 12, Dec., 1975, pp. 683-696.

25. Guttag, J.V. The Specification and Application to Programming of
Abstract Data Types. Ph.D. Thesis, University of Toronto, CSRG-59, 1975.

26. Guttag, J.V. Abstract Data Types and the Development of Data Structures.
Comm. ACM Vol. 20 No. 6, June, 1977, pp. 396-404.

27. Guttag, J.V., Horowitz, E., Musser, D.R. The Design of Data Type
Specification. In Current Trends in Programming Methodology, Vol. IV, Data
Structuring, (ed. Yeh, R.T.), Prentice Hall, Englewood Cliffs, NJ, 1978.

28. Guttag, J.V., Horning, J.J. The Algebraic Specification of Abstract Data
Types. Acta Informatica Vol. 10 No. 1, 1978, pp. 27-52.

29. Guttag, J.V., Horowitz, E., Musser, D.R. Abstract Data Types and
Software Validation. Comm. ACM Vol. 21 No. 12, Dec., 1978, pp. 1048-1064.

30. Guttag, J.V. Personal Communication.

31. Guttag, J.V. Notes on Type Abstraction. IEEE Trans. on Software
Engineering Vol. SE-6 No. 1, Jan., 1980, pp. 13-23.

32. Guttag, J.V., Horning, J.J. Formal Specification as a Design Tool.
Proceedings of the Seventh ACM Symposium on Principles of Programming
Languages, Las Vegas, Nevada, Jan., 1980.

33. Guttag, J.V. Personal Communication, Jan., 1980.

34. Harel, D., Pratt, V.R. Ccaitomon Progetto N Verification. In Research
Directions in Software Technology (ed. Wegner, P.), M.I.T. Press, Cambridge,
MA, 1979, pp. 387-391.


35. Hewitt, C. Personal Communication. Lab. for Computer Science, MIT,
Dec., 1978.

36. Hoare, C.A.R. Procedures and Parameters: An Axiomatic Approach. In
Symposium on Semantics of Algorithmic Languages, (ed. Engeler, E.) as Lecture
Notes in Mathematics, No. 188, Springer Verlag, 1971, pp. 102-115.

37. Hoare, C.A.R. Proof of Correctness of Data Representations. Acta
Informatica Vol. 1, No. 4, 1972, pp. 271-281.

38. Hoare, C.A.R. Notes on Data Structuring. In Structured Programming
(Dahl, O.-J., Dijkstra, E.W., Hoare, C.A.R.), Academic Press, London and New
York, 1972, pp. 83-174.

39. Hoare, C.A.R. Recursive Data Structures. Intl. Journal of Computer and
Information Sciences Vol. 4 No. 2, June, 1975, pp. 105-132.

40. Kapur, D. Proving Correctness of Implementation of a Data Abstraction
Using the Algebraic Method. Unpublished Handout, M.I.T. 6.891
Specification Techniques, Nov., 1975.

41. Kapur, D. Specifications of Majster's Traversable Stack and Veloso's
Traversable Stack. SIGPLAN Notices Vol. 14 No. 5, May, 1979, pp. 46-53.

42. Kapur, D., Srivas, M.K. Expressiveness of the Operation Set of A Data
Abstraction. Proceedings of the Seventh ACM Symposium on Principles of
Programming Languages, Las Vegas, Nevada, Jan., 1980. An expanded version
appeared as Computation Structures Group Memo 179-1, Lab. for Computer
Science, MIT, Jan., 1980.

43. Kapur, D. The Expressive Power of Algebraic Languages for Specifying
Abstract Data Types. Draft Manuscript, Lab. for Computer Science, MIT,
June, 1979.

44. Knuth, D.E., Bendix, P.B. Simple Word Problems in Universal Algebra.
In Computational Problems in Abstract Algebra (ed. Leech, J.), Pergamon Press,
1970, pp. 263-297.

45. Lampson, B.W., Horning, J.J., London, R.L., Mitchell, J.G., Popek, G.L.
Report on the Programming Language Euclid. SIGPLAN Notices Vol. 12
No. 2, Feb., 1977.


46. Levin, R. Program Structures for Exceptional Condition Handling. Ph.D.
Thesis, Dept. of Computer Science, Carnegie-Mellon University, June, 1977.

47. Liskov, B.H., Zilles, S.N. Specification Techniques for Data Abstractions.
IEEE Trans. on Software Engg. Vol. SE-1 No. 1, 1975, pp. 7-19.

48. Liskov, B.H., Berzins, V. An Appraisal of Program Specifications.
Computation Structures Group Memo 141-1, Lab. for Computer Science, MIT,
Jan., 1977. Also in Research Directions in Software Technology (ed. Wegner,
P.), M.I.T. Press, Cambridge, MA, 1979, pp. 276-301.

49. Liskov, B.H., Snyder, A., Atkinson, R., Schaffert, C. Abstraction
Mechanisms in CLU. Comm. ACM Vol. 20 No. 8, Aug., 1977, pp. 564-576.

50. Liskov, B.H., Snyder, A.S. Exception Handling in CLU. IEEE Trans. on
Software Engg. Vol. SE-5 No. 6, Nov., 1979, pp. 546-558.

51. Liskov, B.H. Modular Program Construction Using Abstractions.
Computation Structures Group Memo 184, Lab. for Computer Science, MIT,
Sept., 1979.

52. Liskov, B.H. et al. CLU Reference Manual. MIT-LCS-TR-225, Lab. for
Computer Science, MIT, Oct., 1979.

53. Majster, M.E. Limits of the Algebraic Specification of Abstract Data
Types. SIGPLAN Notices Vol. 12 No. 10, Oct., 1977.

54. Manna, Z. Mathematical Theory of Computation. McGraw Hill,
Computer Science Series, 1974.

55. Manna, Z. Six Lectures on the Logic of Computer Programming. Stanford
A.I. Laboratory AIM-318, Nov., 1978.

56. McCarthy, J. Towards a Mathematical Science of Computation.
Proceedings IFIP Congress, 1962, pp. 27-28.

57. McCarthy, J. A Basis for a Mathematical Theory of Computation. In
Computer Programming and Formal Systems (eds. Braffort and Hirschberg),
North Holland Publishing Co., Amsterdam-London, 1963, pp. 33-70.


58. Minsky, M. Computation: Finite and Infinite Machines. Prentice Hall,
Englewood Cliffs, NJ, 1967.

59. Morris, J.H., Jr. Types Are Not Sets. Proceedings of the First ACM
Symposium on Principles of Programming Languages, Boston, Oct., 1973,
pp. 120-124.

60. Musser, D.R. Abstract Data Types in the AFFIRM System. IEEE Trans.
on Software Engg. Vol. SE-6 No. 1, Jan., 1980, pp. 24-31.

61. Musser, D.R. Proving Inductive Properties of Abstract Data Types.
Proceedings of the Seventh ACM Symposium on Principles of Programming
Languages, Las Vegas, Nevada, Jan., 1980.

62. Nakajima, R., Nakahara, H., Honda, M. Hierarchical Program
Specification and Verification - A Many Sorted Logical Approach. Preprint
RIMS 265, Nov., 1978.

63. Nourani, F. Constructive Extension and Implementation of Abstract Data
Types and Algorithms. Ph.D. Thesis, Dept. of Computer Science, University of
California, Los Angeles, June, 1979.

64. Okrent, H.F. Synthesis of Data Structures from Algebraic Descriptions.
Ph.D. Thesis, Dept. of EE. & CS., MIT.

65. Palme, J. Protected Program Modules in Simula 67. National Defense
Research Institute, Stockholm, Sweden, July, 1973.

66. Parnas, D.L. Information Distribution Aspects of Design Methodology.
Information Processing 71, Vol. I, North Holland, Amsterdam, 1972,
pp. 339-344.

67. Polajnar, J. An Algebraic View of Protection and Extendibility in Abstract
Data Types. Ph.D. Thesis, University of Southern
California, Sept., 1978.

68. Srivas, M.K. Preliminary Investigations of a Thesis Topic on Automatic
Synthesis of Abstract Data Types. Lab. for
Computer Science, MIT, Dec., 1978.


69. Standish, T.A. Data Structures - An Axiomatic Approach. Bolt, Beranek,
and Newman, Inc., Technical Report 2639, Aug., 1973.

70. Subrahmanyam, P. On a Finite Axiomatization of the Data Type L.
SIGPLAN Notices Vol. 13 No. 4, April, 1978, pp. 80-84.

71. Thatcher, J.W., Wagner, E.G., Wright, J.B. Data Type Specification:
Parameterization and the Power of Specification Techniques. Proceedings of
the Tenth SIGACT Conference, May, 1978. Also an IBM Report RC7757,
July, 1979.

72. Wegbreit, B., and Spitzen, J.M. Proving Properties of Complex Data
Structures. JACM Vol. 23 No. 2, April, 1976, pp. 389-396.

73. Wirth, N. Program Development by Stepwise Refinement. Comm. ACM
Vol. 14 No. 4, April, 1971, pp. 221-227.

74. Wulf, W., London, R.L., and Shaw, M. Abstraction and Verification in
ALPHARD: Introduction to Language and Methodology. USC Information
Sciences Institute Research Report, 1976.

75. Wulf, W., London, R.L., and Shaw, M. An Introduction to the
Construction and Verification of Alphard Programs. IEEE Trans. on Software
Engg. Vol. SE-2 No. 4, Dec., 1976, pp. 253-265.

76. Zilles, S.N. Algebraic Specification of Data Types. Project MAC Progress
Report, 1974, pp. 52-58. Also Computation Structures Group Memo 119, Lab.
for Computer Science, MIT, 1974.

77. Zilles, S.N. An Introduction to Data Algebra. Draft Working Paper, IBM
San Jose Research Lab., 1975.




Appendix I - Elaboration of Scope and Assumptions


In this appendix, we elaborate on the scope of the thesis and the assumptions
made about abstract data types and their operations.


1. Immutable and Mutable Data Types


We adopt the commonly accepted informal view of a data type as a collection of
objects with a finite collection of operations to manipulate these objects. The objects by
themselves are not meaningful, and the operations are the only way to construct,
manipulate, and observe the objects as well as to extract information stored in them.

Data types can be classified based on their object behavior. An object of a data
type may or may not exhibit time varying behavior. An object exhibiting time varying
behavior is called a mutable object, whereas an object whose behavior does not change is
called an immutable object [49]. We also call an immutable object a value. A data type
having only immutable objects is called an immutable data type; otherwise, a data type is
called a mutable data type. A mutable data type may also have immutable objects, but at
least one of its objects must be mutable. A mutable object can be factored into two
components: (i) identity, and (ii) state [47]. A mutable data type has at least one operation
constructing new objects. Its operations may change the state of a mutable object without
affecting the object identity. At a given point in a computation, there can exist many
different mutable objects having the same state. For a wide class of mutable data types, the
state component of the mutable objects can be described as an immutable data type.
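
The factoring of a mutable object into identity and state can be illustrated with a small Python sketch. The class name `MutableStack` and the choice of a tuple for the state are assumptions made for this illustration only; the thesis does not prescribe a representation.

```python
class MutableStack:
    """A mutable object: the Python object supplies the identity, and the
    state is an immutable tuple that operations replace, never modify."""
    def __init__(self, state: tuple = ()):
        self.state = state

    def push(self, x) -> None:
        # A new immutable state value is installed; the identity is unchanged.
        self.state = self.state + (x,)

a, b = MutableStack(), MutableStack()
a.push(1)
b.push(1)          # a and b now have the same state but different identities
alias = a          # alias shares a's identity
a.push(2)          # the change is visible through alias, not through b
```

Here the tuple plays the role of a value of an immutable data type describing the state component.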

In this thesis, we have considered only immutable data types with a finite set of
computable operations. We have not considered immutable data types with iterators [49]
nor data types involving streams and lazy evaluation [18].


2. Exceptional Behavior


During the design and construction of reliable software, there is often a need to
have data types with operations exhibiting exceptional behavior. (See [24, 46, 52, 50] for a
discussion on the need for an exception handling mechanism in a programming language.)
It is only meaningful to apply such operations on a subset of their domains. If an input
falls outside the subset, such operations notify their callers, indicating that the input is not
'good,' by signalling exceptions. An exception is assumed to have two components, a
descriptive name and a possible set of arguments which carry information from the point
where the exception is signalled to its handlers.

We assume that every operation of a data type terminates on every input in its
domain: it either terminates normally by returning a value of its range type or terminates
by signalling an exception. We think it is not a good practice to design data types having
operations that do not terminate on some inputs. If a partial function on the values of a
data type needs to be realized, it can be programmed in terms of the operations of the data
type in a host programming language supporting the data type mechanism.
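
The assumption that every operation either returns a value or signals an exception can be sketched as follows. This is a minimal illustration, not the thesis's notation: the bound `LIMIT`, the exception names `overflow` and `underflow`, and the helper `outcome` are hypothetical.

```python
class Signal(Exception):
    """An exception: a descriptive name plus arguments for its handlers."""
    def __init__(self, name: str, *arguments):
        super().__init__(name, *arguments)
        self.name = name
        self.arguments = arguments

LIMIT = 3  # hypothetical upper bound on the size of a stack

def push(stack: tuple, x: int) -> tuple:
    """Terminates normally on stacks below the bound, signals otherwise."""
    if len(stack) >= LIMIT:
        raise Signal("overflow", x)
    return stack + (x,)

def pop(stack: tuple) -> tuple:
    """Terminates normally on nonempty stacks, signals otherwise."""
    if not stack:
        raise Signal("underflow")
    return stack[:-1]

def outcome(op, *inputs):
    """Report ('normal', value) or ('signal', name, arguments) for one call."""
    try:
        return ("normal", op(*inputs))
    except Signal as s:
        return ("signal", s.name, s.arguments)
```

Every input in the domain yields exactly one kind of outcome, so both operations are total in the sense used above.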

The assumption of the operations being total simplifies the formalism developed
in the thesis. Our formalism can be extended to partial operations without much difficulty
by introducing a special value 'undefined' for every data type such that if a partial
operation is not defined on an input, then it returns 'undefined' on that input.


3. Nondeterminism


There are data types some of whose operations exhibit nondeterministic behavior.
These operations return one of many possible values for a given input. For example, the
Choose operation of the data type finite set of integers, which returns any element of a
given nonempty set, is nondeterministic. Similarly, the Index operation of the data type
finite sequence of elements, which returns a position of a given element in a given sequence,
is also nondeterministic because the sequence can have more than one occurrence of the
same element. All prior work on data types has assumed the operations to be deterministic.
We feel that a formalism for data types must be capable of handling data types with
nondeterministic operations, as nondeterminism is a powerful and elegant abstraction
mechanism for designing programs [13, 9]. Furthermore, allowing nondeterministic
operations permits the handling of data types with operations implemented in a parallel
environment.
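
One way to make the two examples concrete is to model a nondeterministic operation as a function returning the finite set of its possible results. The sketch below assumes this set-valued encoding, 1-based positions for Index, and a `ValueError` standing in for a signalled exception; none of these choices is taken from the thesis.

```python
def choose(s: frozenset) -> frozenset:
    """Choose on a nonempty finite set: every element is a possible result."""
    if not s:
        raise ValueError("Choose applies to nonempty sets only")
    return frozenset(s)

def index(seq: tuple, x) -> frozenset:
    """Index: every position (1-based) at which x occurs is a possible result."""
    positions = frozenset(i for i, y in enumerate(seq, start=1) if y == x)
    if not positions:
        raise ValueError("x does not occur in seq")
    return positions

possible_elements = choose(frozenset({4, 7}))
possible_positions = index((5, 7, 5), 5)
```

In this encoding a deterministic operation is simply one whose result set is a singleton on every input.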

We assume that a nondeterministic operation has only finitely many choices on a
particular input. We rule out data types having operations with infinitely many choices.
Such an operation can be used to write programs having unbounded nondeterminism [13].
There is a controversy about the realizability of programming constructs having
unbounded nondeterminism and about the limitation of the expressive power of a language
that rules out programs with unbounded nondeterminism [35]. Using our formalism, it is
possible to define a data type whose values are 'infinite' (e.g., 'infinite' sets, 'infinite'
sequences, etc.) insofar as these values can be finitely constructed using the operations;
but nondeterministic operations on these values that have infinitely many choices are ruled
out. Our formalism would however extend without much difficulty to the case where the
constraint that a nondeterministic operation has only finitely many choices on an input is
dropped.

We also assume that if a nondeterministic operation signals an exception on an
input, then the operation behaves deterministically on the input. Thus a nondeterministic
operation is not allowed to have a choice between signalling and terminating normally on
any particular input. This assumption leads to a simpler and modular characterization of
the observable behavior of the data type than would otherwise be possible.




Appendix II - Definitions of Algebraic Concepts and Proofs of
Theorems in Chapter 2


In the first section, we extend the definitions of congruence, homomorphism, and
isomorphism to extended heterogeneous algebras having nondeterministic functions. In
the second section, we present the proof of Theorem 2.2. In the third section, we explain
how Definition 2.12 of behavioral equivalence on type algebras captures the desired
property that a computation (i.e., an interpretation of a ground term) results in equivalent
values in two behaviorally equivalent type algebras.


1. Congruence, Homomorphism, and Isomorphism


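Def. A2.1 A congruence $R$ on a conventional heterogeneous algebra
$A = [\, \{ V_{D'} \mid D' \in \Delta' \} ;\ \{ f_\sigma \mid \sigma \in \Omega' \} \,]$,
in which each $f_\sigma$ is a total deterministic function, is a family of equivalence relations
$\{ R_{D'} \mid D' \in \Delta' \}$ such that
for every $\sigma \in \Omega'$, $\sigma : D_1 \times \ldots \times D_n \to D'$,
for all $v_1, v_1' \in V_{D_1}, \ldots, v_n, v_n' \in V_{D_n}$,
$v_1 R_{D_1} v_1' \wedge \ldots \wedge v_n R_{D_n} v_n' \implies f_\sigma(v_1, \ldots, v_n) \; R_{D'} \; f_\sigma(v_1', \ldots, v_n')$.   (*)
We also say that $R$ has the substitution property.
In an extended heterogeneous algebra having nondeterministic functions, when $f_\sigma$
is a nondeterministic total function, then (*) is modified to
$v_1 R_{D_1} v_1' \wedge \ldots \wedge v_n R_{D_n} v_n' \implies (\forall z \in \{ f_\sigma(v_1, \ldots, v_n) \}) (\exists z' \in \{ f_\sigma(v_1', \ldots, v_n') \}) [\, z \; R_{D'} \; z' \,]$
$\wedge \; (\forall z' \in \{ f_\sigma(v_1', \ldots, v_n') \}) (\exists z \in \{ f_\sigma(v_1, \ldots, v_n) \}) [\, z \; R_{D'} \; z' \,]$.
If $R_{D'}$ is the identity relation (equality), then the above reduces to
$\{ f_\sigma(v_1, \ldots, v_n) \} = \{ f_\sigma(v_1', \ldots, v_n') \}$. ∎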
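
The modified substitution property can be checked mechanically on a small finite example. The sketch below is an illustration only: it assumes a single carrier, a unary operation, and set-valued functions as the encoding of nondeterminism, and the names `is_congruence`, `same_parity`, `step`, and `halve` are hypothetical.

```python
from itertools import product

def is_congruence(carrier, related, f):
    """Check the modified substitution property for a unary set-valued f.

    related(a, b) is the candidate equivalence relation; f(a) is the set of
    possible results of the nondeterministic operation on a.
    """
    for v, w in product(carrier, repeat=2):
        if not related(v, w):
            continue
        # Every result on v is matched by a related result on w, and conversely.
        forward = all(any(related(z, z2) for z2 in f(w)) for z in f(v))
        backward = all(any(related(z, z2) for z in f(v)) for z2 in f(w))
        if not (forward and backward):
            return False
    return True

carrier = range(6)
same_parity = lambda a, b: a % 2 == b % 2
step = lambda a: {(a + 1) % 6, (a + 3) % 6}   # nondeterministic, flips parity
halve = lambda a: {a // 2}                    # deterministic, ignores parity
```

With these definitions, parity is compatible with `step` but not with `halve`, since 0 and 2 have the same parity while their halves 0 and 1 do not.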


Congruences on an extended heterogeneous algebra $A$ can also be partially
ordered in the same way as in case of a conventional heterogeneous algebra:
Given two congruences $E^1$ and $E^2$, $E^2$ is larger than $E^1$, expressed as $E^1 \le E^2$, if and only if
for each $D' \in \Delta'$, $E^1_{D'} \subseteq E^2_{D'}$.
Congruences form a lattice with respect to $\le$, and have the least element (the identity
congruence) and the greatest element (the universal congruence).


Def. A2.2 Let $A_1$ and $A_2$ be
$A_1 = [\, \{ V^1_{D'} \mid D' \in \Delta' \} ;\ \{ f^1_\sigma \mid \sigma \in \Omega' \} \,]$,
$A_2 = [\, \{ V^2_{D'} \mid D' \in \Delta' \} ;\ \{ f^2_\sigma \mid \sigma \in \Omega' \} \,]$.
A family of total (deterministic) functions $\Phi = \{ \Phi_{D'} : V^1_{D'} \to V^2_{D'} \mid D' \in \Delta' \}$ is called a
homomorphism from $A_1$ to $A_2$ if
for each $\sigma : D_1 \times \ldots \times D_n \to D'$,
for each $v_1$ of type $D_1$ (i.e., $v_1 \in V^1_{D_1}$), ..., $v_n$ of type $D_n$,
(i) if $f^1_\sigma$ is deterministic, then $f^2_\sigma$ is also deterministic and
$\Phi_{D'}(f^1_\sigma(v_1, \ldots, v_n)) = f^2_\sigma(\Phi_{D_1}(v_1), \ldots, \Phi_{D_n}(v_n))$, and
(ii) if $f^1_\sigma$ is nondeterministic, then $f^2_\sigma$ is either deterministic or nondeterministic and
$\Phi_{D'}(\{ f^1_\sigma(v_1, \ldots, v_n) \}) = \{ f^2_\sigma(\Phi_{D_1}(v_1), \ldots, \Phi_{D_n}(v_n)) \}$.
(Case (ii) above covers case (i) also.) We call $\Phi$ an onto homomorphism from $A_1$ to $A_2$ if
every function in $\Phi$ is onto; in that case, $A_2$ is called a homomorphic image of $A_1$. If every
function in $\Phi$ is a bijection, then $\Phi$ is an isomorphism from $A_1$ to $A_2$, and $A_1$ and $A_2$ are
isomorphic. Note that, if $A_1$ and $A_2$ are isomorphic nondeterministic algebras, then they
have the same amount of nondeterminism, which is not necessarily the case if $A_2$ is a
homomorphic image of $A_1$.
It can be shown that the results from conventional heterogeneous algebras in [4]
extend to the extended heterogeneous algebras. In particular, we can show that


Prop. A2.1 If $R$ is a congruence on an extended heterogeneous algebra $A$, then there exists
an onto homomorphism from $A$ to $A/R$.


Prop. A2.2 If $\Phi$ is an onto homomorphism from $A_1$ to $A_2$, then the kernel $R$ of $\Phi$ on $A_1$,
where $R = \{ R_{D'} \mid D' \in \Delta' \}$ and $R_{D'} = \{ \langle v, v' \rangle \mid \Phi_{D'}(v) = \Phi_{D'}(v') \}$, is a congruence on $A_1$.
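
The homomorphism condition and the kernel construction can be tried out on a small finite algebra. The sketch below is an illustration under assumptions: one carrier per algebra, a unary operation, set-valued functions for nondeterminism, and hypothetical names (`is_homomorphism`, `kernel`, `f1`, `f2`, `phi`).

```python
def is_homomorphism(carrier1, phi, f1, f2):
    """Condition (ii): the image under phi of the result set of f1 on v
    equals the result set of f2 on phi(v), for every v in carrier1."""
    return all({phi(z) for z in f1(v)} == set(f2(phi(v))) for v in carrier1)

def kernel(carrier1, phi):
    """All pairs <v, w> of carrier1 that phi maps to the same value."""
    return {(v, w) for v in carrier1 for w in carrier1 if phi(v) == phi(w)}

carrier1 = range(6)
f1 = lambda a: {(a + 1) % 6, (a + 3) % 6}   # nondeterministic on algebra 1
phi = lambda a: a % 2                       # onto map to the carrier {0, 1}
f2 = lambda b: {1 - b}                      # deterministic on algebra 2

ok = is_homomorphism(carrier1, phi, f1, f2)
ker = kernel(carrier1, phi)
```

In this example the homomorphic image is deterministic although the source operation is not, which matches the remark that a homomorphic image need not have the same amount of nondeterminism.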


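The following diagram, in which $\Phi$ is an onto homomorphism from $A_1$ to $A_2$, $R$ is
the kernel of $\Phi$ on $A_1$, $H$ is the homomorphism induced by $R$ from $A_1$ to $A_1/R$, and $\Phi'$ is an
isomorphism from $A_1/R$ to $A_2$, commutes, i.e., $\Phi = \Phi' \cdot H$.

[Diagram: a triangle with $\Phi : A_1 \to A_2$ along the top, $H : A_1 \to A_1/R$, and $\Phi' : A_1/R \to A_2$.]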


2. Proof of Theorem 2.2 


Thm. 2.2 Assuming that $E_{Bool}$ is the largest congruence on a model of Bool, $E$ is the
largest congruence on $A$.


Proof By induction on type algebras.
Basis: $\Delta = \emptyset$, the null set.
(i) Bool - the statement holds because of the assumption.
(ii) $D$ other than Bool - since every value in $V_D$ is observably equivalent to every other
value, the statement is true.


Inductive Step: $\Delta \neq \emptyset$.

Assume that the statement holds for each $D' \in \Delta$.

To prove the statement for $D$, we must show that $E_D$ is the largest equivalence relation
such that $E$ is a congruence on $A$. We prove this by contradiction.

Suppose $E_D$ is not the largest equivalence relation and $E'_D$ is a larger equivalence
relation containing $E_D$ such that $E' = \{ E_{D'} \mid D' \in \Delta \} \cup \{ E'_D \}$ is a congruence on $A$.
There exists $\langle v_1, v_2 \rangle \in E'_D$ such that $\langle v_1, v_2 \rangle \notin E_D$. So, there is a $c(x)$ of type $D' \in \Delta$ such that
there is an interpretation of $c[x/v_1]$ in $A$ distinguishable from every interpretation of $c[x/v_2]$
in $A$ or vice versa. But this is contradictory to $E'$ being a congruence, which requires that
for every interpretation $v_1'$ of $c[x/v_1]$ in $A$, there is an interpretation $v_2'$ of $c[x/v_2]$ in $A$ such
that $\langle v_1', v_2' \rangle \in E_{D'}$, and vice versa. So, $E_D$ is the largest equivalence relation. ∎


Modification for type algebras having an exception domain

The proof has the same structure as above, except that we also have to consider
the case when $\langle v_1, v_2 \rangle \notin E_D$ implies that $v_1$ and $v_2$ are distinguishable because a computation
$c(x)$ (i) signals on $v_1$ and returns a normal value on $v_2$ or vice versa, or (ii) signals
distinguishable exceptional values on $v_1$ and $v_2$. In the basis step, for the case of $D$ other
than Bool, $E_D$ need not be the universal relation on $V_D$.


3. Elaboration of the Definition of Behavioral Equivalence and
Proofs of Theorems 2.5 and 2.6


In Section 2.2, we defined two type algebras to be behaviorally equivalent if their
reduced algebras are isomorphically equivalent. We further elaborate on this definition.
We prove Theorems 2.5 and 2.6 of Section 2.2. The discussion and theorems of this section
extend to modified type algebras having the exception domain. The set of mappings from
a modified type algebra $A$ to another modified type algebra $A'$ includes a mapping from
the exception domain of $A$ to the exception domain of $A'$, which is defined by the
mappings on the normal domains.

As is discussed in Subsection 2.2.5, the behavioral equivalence of type algebras $A_1$
and $A_2$ can be expressed as

[Diagram: a square with $\Psi : A_1 \to A_2$ along the top, $H_1 : A_1 \to A_1/E_1$ and $H_2 : A_2 \to A_2/E_2$ down the sides, and $\Phi : A_1/E_1 \to A_2/E_2$ along the bottom.]

such that the above diagram commutes, i.e.,
$\Phi \cdot H_1 = H_2 \cdot \Psi$   (+)
where $A_1/E_1$ and $A_2/E_2$ are the reduced algebras corresponding to $A_1$ and $A_2$ respectively,
and $\Phi$ is the isomorphism defined by the isomorphic equivalence of $A_1/E_1$ and $A_2/E_2$. The
equation (+) above defines the set $\Psi$ of many to many mappings, where
$\Psi = \{ \Psi_{D'} : V^1_{D'} \to V^2_{D'} \mid D' \in \Delta \cup \{ D \} \}$.
We first discuss how for two isomorphically equivalent algebras $A_1$ and $A_2$, the
bijection $\Phi_D$ in an isomorphism $\Phi$ can be constructed, and show that the interpretations of a
ground term $e$ in $A_1$ and $A_2$ are 'equivalent.' Later, we discuss these properties for
behaviorally equivalent algebras.


3.1 Isomorphically Equivalent Type Algebras


For the case when the deterministic constructors of a data type $D$ can generate all
the values of $D$, we have


Thm. A2.1 If $A_1$ and $A_2$ are isomorphically equivalent, then $\{ \Phi_{D'} \mid D' \in \Delta \}$ uniquely
determines the bijection $\Phi_D$.


Proof By definition of isomorphic equivalence, there exists a bijection $\Phi_D : V^1_D \to V^2_D$
such that $\Phi = \{ \Phi_{D'} \mid D' \in \Delta \} \cup \{ \Phi_D \}$ is an isomorphism. We prove the statement by
contradiction. Let us assume that $\Phi_D$ is not unique; instead, there are two bijections $\Phi^1_D$
and $\Phi^2_D$ such that $\Phi^1 = \{ \Phi_{D'} \mid D' \in \Delta \} \cup \{ \Phi^1_D \}$ and $\Phi^2 = \{ \Phi_{D'} \mid D' \in \Delta \} \cup \{ \Phi^2_D \}$ are
isomorphisms.
Since $\Phi^1_D$ and $\Phi^2_D$ are different, there exists $v \in V^1_D$ such that $\Phi^1_D(v) \neq \Phi^2_D(v)$. We pick a $v$
that can be constructed by the minimum number (say $k$) of applications of the
deterministic constructors and on which $\Phi^1_D$ and $\Phi^2_D$ differ. We have $v = f^1_\sigma(v_1, \ldots, v_n)$ for
some $\sigma$, and if $D_i = D$, $v_i$ can be constructed by $k_i < k$ applications of
constructors; thus, $\Phi^1_D(v_i) = \Phi^2_D(v_i)$.
By the definition of isomorphic equivalence,
$\Phi^1_D(v) = f^2_\sigma(\Phi^1_{D_1}(v_1), \ldots, \Phi^1_{D_n}(v_n))$, and
$\Phi^2_D(v) = f^2_\sigma(\Phi^2_{D_1}(v_1), \ldots, \Phi^2_{D_n}(v_n))$,

meaning that $\Phi^1_D(v) = \Phi^2_D(v)$, which is a contradiction.

So, there is no $v$ such that $\Phi^1_D(v) \neq \Phi^2_D(v)$.

Hence the proof of the theorem. ∎


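We can construct the bijection $\Phi_D$ as follows:
For every constructor $\sigma : D_1 \times \ldots \times D_n \to D$,
$(\Phi_{D_1}(v_1) = v_1' \wedge \ldots \wedge \Phi_{D_n}(v_n) = v_n') \implies \Phi_D(f^1_\sigma(v_1, \ldots, v_n)) = f^2_\sigma(v_1', \ldots, v_n')$.
The case of $\sigma$'s not taking any argument of type $D$ serves as the basis step in the
construction of $\Phi_D$.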
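
The construction above can be sketched for a concrete pair of algebras. The example assumes two hypothetical representations of stacks of integers (tuples in the first algebra, nested pairs in the second), two deterministic constructors, and the identity as the bijection on integers; the function `build_phi` tabulates the bijection only up to a chosen number of constructor applications.

```python
from itertools import product

# Algebra 1: stacks as tuples (the top is the last element).
new1 = lambda: ()
push1 = lambda s, x: s + (x,)

# Algebra 2: stacks as nested pairs (top, rest), with None for the empty stack.
new2 = lambda: None
push2 = lambda s, x: (x, s)

def build_phi(ints, depth):
    """Tabulate Phi_D on all stacks built with at most `depth` pushes."""
    phi = {new1(): new2()}   # basis: the constructor taking no stack argument
    frontier = [new1()]
    for _ in range(depth):
        next_frontier = []
        for s, x in product(frontier, ints):
            # Phi on integers is the identity, so x is mapped to itself.
            phi[push1(s, x)] = push2(phi[s], x)
            next_frontier.append(push1(s, x))
        frontier = next_frontier
    return phi

phi = build_phi(ints=(0, 1), depth=3)
```

Each value of the first algebra is reached by exactly one constructor term here, so the table is well defined; in general one would also have to check that different terms denoting the same value receive the same image.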

The above theorem holds in case $A_1$ and $A_2$ are reduced even if some of the
values of $D$ cannot be constructed without using a nondeterministic constructor. However,
it does not hold in general; for example, consider a variation of the type algebra $A^1$ for
Set-Int, denoted by $A^3$, having everything else the same as in $A^1$ except that $In^3$, the
interpretation of the operation Insert, is a nondeterministic function, which appends the
integer being inserted to the beginning of the sequence representing the given set or at the
end of the sequence.

$In^3(\langle i_1, \ldots, i_n \rangle, i) = \langle i_1, \ldots, i_n \rangle$ if $i = i_j$ for some $1 \le j \le n$,
$In^3(\langle i_1, \ldots, i_n \rangle, i) = \langle i, i_1, \ldots, i_n \rangle$ or $\langle i_1, \ldots, i_n, i \rangle$ otherwise.
$A^3$ is clearly isomorphically equivalent to itself and there is more than one isomorphism
from $A^3$ to itself.


Thm. A2.2 Given two isomorphically equivalent type algebras $A_1$ and $A_2$ defining an
isomorphism $\Phi$, a value $v$ of type $D$ in $A_1$ has the same observable behavior in $A_1$ as $\Phi_D(v)$
in $A_2$ in the sense that for every term $c(x)$ of type $D' \in (D)^*$ with one free variable of type
$D$,
$\Phi_{D'}(\{ [\![ c[x/v] ]\!]_{A_1} \}) = \{ [\![ c[x/\Phi_D(v)] ]\!]_{A_2} \}$.
Proof By induction on the depth of $x$ in $c(x)$.
depth($x$) = 0.
depth($\sigma(e_1, \ldots, e_n)$) = max(depth($e_1$), ..., depth($e_n$)) + 1,
where $e_i$ has $x$ as a variable.


Basis depth($c(x)$) = 0.
So, $c(x)$ is $x$, and the statement of the theorem trivially holds.
Inductive Step Assume the statement of the theorem for the case when
depth($c(x)$) < $k$, $k$ > 0, to show for the case when depth($c(x)$) = $k$. Let
$c(x) = \sigma(e_1, \ldots, e_n)$,
where $e_i$ is of type $D_i$. We assume that the statement holds for each $e_i$, so
$\Phi_{D_i}(\{ [\![ e_i[x/v] ]\!]_{A_1} \}) = \{ [\![ e_i[x/\Phi_D(v)] ]\!]_{A_2} \}$.

$\Phi_{D'}(\{ [\![ c[x/v] ]\!]_{A_1} \}) = \Phi_{D'}(\{ f^1_\sigma([\![ e_1[x/v] ]\!]_{A_1}, \ldots, [\![ e_n[x/v] ]\!]_{A_1}) \})$

$= \{ f^2_\sigma(\Phi_{D_1}([\![ e_1[x/v] ]\!]_{A_1}), \ldots, \Phi_{D_n}([\![ e_n[x/v] ]\!]_{A_1})) \}$ (since $\Phi$ is an isomorphism)

$= \{ f^2_\sigma([\![ e_1[x/\Phi_D(v)] ]\!]_{A_2}, \ldots, [\![ e_n[x/\Phi_D(v)] ]\!]_{A_2}) \}$

$= \{ [\![ \sigma(e_1, \ldots, e_n)[x/\Phi_D(v)] ]\!]_{A_2} \} = \{ [\![ c[x/\Phi_D(v)] ]\!]_{A_2} \}$. ∎
For the case of modified type algebras, we are interested in terms such that $[\![ c[x/v] ]\!]_{A_1}$
and $[\![ c[x/\Phi_D(v)] ]\!]_{A_2}$ are not undefined.


3.2 Behaviorally Equivalent Type Algebras


Thm. A2.3 If $A_1$ and $A_2$ are behaviorally equivalent,
then $\langle v_1, v_2 \rangle \in \Psi_D \implies \langle [v_1], [v_2] \rangle \in \Phi_D$.


Proof Obvious from the diagram. Since $\Phi \cdot H_1 = H_2 \cdot \Psi$, from $\langle v_1, v_2 \rangle \in \Psi_D$ we get
$\Phi_D([v_1]) = [v_2]$. ∎

We now present the proofs of Theorems 2.5 and 2.6 of Subsection 2.2.5.


Thm. 2.5 For behaviorally equivalent A, and A,, for every: gyound term e of, type 
D"€(D)’, for every velda }, there isa VEL dy} such that <[ v], [y¥ Pe op, and 
vice versa. 
Proof By induction on the structure of type algebras. 

Basis A=B 

(i) D is Bool: Since all behaviorally equivalent algebras are isomorphic and the 
observable equivalence relation is the identity relation, the aboye is true. 

(ii) D is other than Bool: Since the observable cquivalence. relation is the ‘aniverisl 
relation, the above is true. 


I. Inductive Step A # @ 
Assume that the above statement holds for alf ground terms of type D” € (D)* not 


- 234 - 


having any operation symbol in Q. (1) 
To show for a ground term e by induction.on number of operation symbol from @ in e. 
2. The basis step holds because of the assumption. 


2. Inductive Step Assume for e haying k< ae of operation symbols from Q, 
to show for e having k occurrences. : (2) . 
This is also proved by induction on n the depth of the outermost operation symbol 
from Q in e. 
depth(ole,,..€))=0 — ifo€ = 
depth(ofe,, .... e,)) = min(depth(e,), ..., depth(e) +1 ifo €a. 
3. Basis depth(e) = 0, i.e., e = σ(e_1, ..., e_n), and σ ∈ Ω.
So, any e_i can have at most k-1 occurrences of operations from Ω.
We prove the statement of the theorem in one direction; the proof in the other
direction is the same except that v is to be replaced for v'.
If v ∈ {e|A_1}, i.e., if [v] ∈ {e|A_1/E_1}, then there is a choice of the interpretation of σ
in A_1/E_1 such that
[v] = σ([v_1], ..., [v_n]), where [v_i] ∈ {e_i|A_1/E_1} for each 1 ≤ i ≤ n.
By inductive hypothesis (2), for every [v_i] ∈ {e_i|A_1/E_1}, there is a [v_i'] ∈ {e_i|A_2/E_2} such
that Φ([v_i]) = [v_i']. Because Φ is an isomorphism, there is a choice of the interpretation of σ
in A_2/E_2 such that
[v'] = Φ([v]) = σ([v_1'], ..., [v_n']), meaning that v' ∈ {e|A_2}.
3. Inductive Step Assume for e having depth(e) < m, m > 0, to show for e having
depth(e) = m. (3)
e = σ(e_1, ..., e_n), σ ∉ Ω.
The proof goes the same way as for the basis step, except that we use the models
of the data type D' that has the operation σ. ∎


For modified type algebras, we are interested in ground terms whose interpretations are not
undefined. It can be shown for behaviorally equivalent type algebras A_1 and A_2 that if, for
some ground term e, e|A_1 is undefined, then e|A_2 is also undefined, and vice versa. ∎
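The depth measure used in the proof of Thm. 2.5 can be sketched in Python as follows. This is only an illustration: the `Term` class, the symbol names (`Null`, `Insert`, `Has`, `Not`), and the convention of returning infinity for terms containing no symbol from Ω are assumptions of the sketch, not part of the text.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Term:
    """A ground term: an operation symbol applied to argument terms."""
    op: str
    args: Tuple["Term", ...] = ()


def depth(term: Term, omega: FrozenSet[str]) -> float:
    """Depth of the outermost symbol from `omega` in `term`.

    0 when the root symbol is in omega; otherwise one more than the minimum
    depth over the arguments. A term with no symbol from omega gets math.inf
    (a convention of this sketch only).
    """
    if term.op in omega:
        return 0
    return min((depth(a, omega) for a in term.args), default=math.inf) + 1


# Hypothetical symbols: Null and Insert belong to omega; Has and Not do not.
OMEGA = frozenset({"Null", "Insert"})
inner = Term("Insert", (Term("Null"), Term("3")))
e = Term("Not", (Term("Has", (inner, Term("3"))),))
```

Here `depth(inner, OMEGA)` is 0 because the root symbol is in Ω, while each enclosing symbol outside Ω adds one to the depth.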


Thm. 2.6 For behaviorally equivalent A_1 and A_2, for any ground terms e_1 and e_2 of type




D, {[e_1|A_1]} = {[e_2|A_1]} ⇔ {[e_1|A_2]} = {[e_2|A_2]}.
Proof From the above two theorems and the fact that A_1/E_1 and A_2/E_2 are isomorphically
equivalent, the statement is immediate. ∎




Appendix III - Proofs of Theorems in Chapter 4


This appendix contains proofs of various theorems in Chapter 4. 


1. Specifications without Nondeterminism and without Exceptional Behavior


Thm. 4.1 Every constructor ground term e of type Set-Int' is equivalent by equational
reasoning to a ground term e' not having any occurrence of Remove, i.e., the equation
‘e = e'’ ∈ EQ(Set-Int').


Proof For every constructor ground term e of type Set-Int', there is a constructor ground
term e' such that

(*) ‘e = e'’ ∈ EQ(Set-Int') ∧ #re(e') = 0,
where #re(e) gives the number of occurrences of the operation symbol Remove in e.
Similarly, the function #in gives the number of occurrences of the operation symbol Insert
in a term. We show (*) by induction on #re(e).


Basis #re(e) = 0.
The above statement trivially holds, because e' is the same as e.


Inductive Step Assume the statement holds for e such that #re(e) < k,
show for #re(e) = k.
Consider the outermost subterm e_1 in e such that e_1 = Remove(e_11, i1). Clearly,
#re(e_11) < k, so there is a term e_11' such that ‘e_11 = e_11'’ ∈ EQ(Set-Int') and
#re(e_11') = 0. Thus we have ‘e_1 = Remove(e_11', i1)’ ∈ EQ(Set-Int'). We show that (*)
holds for Remove(e_11', i1) by induction on #in(e_11').

Basis #in(e_11') = 0.
‘e_1 = Remove(Null, i1)
= Null’ ∈ EQ(Set-Int') using Axiom 1.
e' is obtained by substituting Null for e_1 in e.




Inductive Step Assume the above holds for #in(e_11') < m,
to show for e_11' having m Insert's.
e_11' = Insert(e_12, i2), so

‘e_1 = Remove(Insert(e_12, i2), i1)’ ∈ EQ(Set-Int').

There are two cases.


Case 1 i1 = i2
‘e_1 = Remove(e_12, i1)’ ∈ EQ(Set-Int'). Axiom 2.
By the inductive hypothesis, there is an e_12' such that
‘Remove(e_12, i1) = e_12'’ ∈ EQ(Set-Int') and #re(e_12') = 0.
So, ‘e_1 = e_12'’ ∈ EQ(Set-Int').

We get e' by replacing e_1 by e_12'.


Case 2 ~(i1 = i2)
‘e_1 = Insert(Remove(e_12, i1), i2)’ ∈ EQ(Set-Int'). Axiom 2.
By the inductive hypothesis, there is an e_12' such that
‘Remove(e_12, i1) = e_12'’ ∈ EQ(Set-Int'), and thus ‘e_1 = Insert(e_12', i2)’ ∈ EQ(Set-Int').

We get e' by replacing e_1 by Insert(e_12', i2).
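The rewriting argument in the proof of Thm. 4.1 can be read as a procedure for eliminating Remove. The following Python sketch is a minimal illustration under stated assumptions: terms are encoded as nested tuples, the two axioms are taken in the form in which the proof uses them (their exact statement is in Chapter 4), and the sketch works bottom-up rather than outermost-first. All function names are hypothetical.

```python
NULL = ("Null",)


def insert(s, i):
    return ("Insert", s, i)


def remove(s, i):
    return ("Remove", s, i)


def count_remove(e):
    """#re(e): number of occurrences of Remove in the term e."""
    if e == NULL:
        return 0
    return (1 if e[0] == "Remove" else 0) + count_remove(e[1])


def push_remove(s, i):
    """Rewrite Remove(s, i) for a Remove-free s (inner induction on #in)."""
    if s == NULL:
        return NULL                          # Axiom 1: Remove(Null, i) = Null
    _, rest, j = s                           # s = Insert(rest, j)
    if i == j:
        return push_remove(rest, i)          # Axiom 2, Case 1 (i1 = i2)
    return insert(push_remove(rest, i), j)   # Axiom 2, Case 2


def normalize(e):
    """Return a Remove-free term equal to e by the two axioms."""
    if e == NULL:
        return NULL
    tag, sub, i = e
    sub = normalize(sub)
    return insert(sub, i) if tag == "Insert" else push_remove(sub, i)


example = remove(insert(insert(NULL, 1), 2), 1)
result = normalize(example)   # Insert(Null, 2)
```

The recursion in `push_remove` mirrors the inner induction on the number of Insert symbols, and `normalize` mirrors the outer induction on the number of Remove symbols.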


Thm. 4.4 If a specification S is sufficiently complete, then S is behaviorally complete.


Proof If S is inconsistent, then since F(S) = ∅, S is trivially behaviorally complete.

If S is consistent, we show that a sufficiently complete S is also behaviorally complete by
contradiction.

Suppose S is not behaviorally complete, so there exist two reduced algebras A_1 and A_2 in
F(S) that are not isomorphically equivalent w.r.t. {P_σ | σ ∈ Ω}. Without any loss of
generality, we can assume that A_1 and A_2 share the same domain corresponding to a
defining type, so for each D' ∈ Δ, Φ_D' is the identity function. Since every constructor is
deterministic, there is a unique mapping Φ_D: V_1 → V_2 which can possibly satisfy the
following for every σ in Ω, where f_1 and f_2 denote the interpretations of σ in A_1 and A_2:

for each set of values v_1, ..., v_n such that P_σ[x_1/v_1, ..., x_n/v_n]|A_1 = T,

(*) Φ(f_1(v_1, ..., v_n)) = f_2(Φ(v_1), ..., Φ(v_n)).




If A_1 and A_2 are not isomorphically equivalent w.r.t. {P_σ | σ ∈ Ω}, this means that there
must exist an observer σ and a set of values v_1, ..., v_n such that P_σ[x_1/v_1, ..., x_n/v_n]|A_1
holds and (*) is not satisfied.
Using the minimality property, we can construct a legal ground term σ(e_1, ..., e_n) of
type D' ∈ Δ, where D' is the range of σ, and for each 1 ≤ i ≤ n, e_i is the ground term whose
interpretation is v_i in A_1. Since S is sufficiently complete, there exists a ground term e' of
type D' not having any operation symbol of D or auxiliary function used in S such that
‘σ(e_1, ..., e_n) = e'’ ∈ EQ(S). This means that
σ(e_1, ..., e_n)|A_1 = σ(e_1, ..., e_n)|A_2 = e'|A_1
because A_1 and A_2 are reduced algebras. This is in contradiction to (*) not being satisfied.
Hence the result. ∎


Thm. 4.6 For a consistent and sufficiently complete S, if any two legal ground terms e_1 and
e_2 of type D are distinguishable by S, then ‘e_1 ≠ e_2’ ∈ DS(S).


Proof That e_1 and e_2 are distinguishable by S means that for any A ∈ F(S), e_1|A and e_2|A are
distinguishable, i.e., there exists a term c(x) of type D' ∈ Δ with one free variable x of type
D such that c[x/v_1]|A is distinguishable from c[x/v_2]|A in A.

Using the above fact, we prove the theorem by induction on specifications.


Basis Specifications with no defining types
Case 1 Bool
‘T ≠ F’ ∈ DS(Bool). Every ground term of type Bool is equivalent to either T or F, so
the theorem holds.
Case 2 D other than Bool
All ground terms are observably equivalent, so the theorem holds.


Inductive Step Assume the above statement for the specification S' of a data type D' used in
the specification S of D. To show for S.
We can prove by contradiction that ‘e_1 ≠ e_2’ ∈ DS(S) as follows:
Assume e_1 = e_2;
then c[x/e_1] = c[x/e_2].
Since S is sufficiently complete, there exist ground terms e_1' and e_2' of type D' such that
e_1', e_2' do not have any occurrence of an operation symbol of D, and ‘c[x/e_1] = e_1'’ ∈ EQ(S) and
‘c[x/e_2] = e_2'’ ∈ EQ(S), so we have ‘e_1' = e_2'’ ∈ EQ(S). Since e_1', e_2' are distinguishable by S', by
inductive hypothesis, ‘e_1' ≠ e_2'’ ∈ DS(S'), so ‘e_1' ≠ e_2'’ is also in DS(S). This is a
contradiction, as S is consistent. So, ‘e_1 ≠ e_2’ ∈ DS(S). ∎


2. Specifications with Exceptional Behavior and without Nondeterminism


Thm. 4.9 Every legal constructor ground term e of type Stk-Int such that
‘N?_Stk-Int(e) = T’ ∈ EQ(Stk-Int) is equivalent by equational reasoning to another legal
constructor ground term e' having only Null and Push, i.e., if ‘N?_Stk-Int(e) = T’ ∈
EQ(Stk-Int), then ‘e = e'’ ∈ EQ(Stk-Int).


Proof The proof is similar to that of Theorem 4.1 above.

Let #po and #rep be the functions on terms computing the number of occurrences of Pop
and Replace respectively. We show by induction on #po(e) + #rep(e) that

(*) if ‘N?_Stk-Int(e) = T’ ∈ EQ(Stk-Int), then there exists an e' such that ‘e = e'’ ∈
EQ(Stk-Int) and #po(e') + #rep(e') = 0.


Basis #po(e) + #rep(e) = 0.
e serves as e'.


Inductive Step Assume (*) above for the case #po(e) + #rep(e) < k,
to show for #po(e) + #rep(e) = k.

Consider the outermost subterm e_1 in e having Pop or Replace as the outermost
operation. It is obvious that if ‘N?_Stk-Int(e) = T’ ∈ EQ(Stk-Int), then ‘N?_Stk-Int(e_1) = T’
∈ EQ(Stk-Int).

Case 1 e_1 = Pop(e_11)

Since ‘N?_Stk-Int(e_11) = T’ ∈ EQ(Stk-Int), by the inductive hypothesis, there exists an e_11' such
that ‘e_11 = e_11'’ ∈ EQ(Stk-Int) and #po(e_11') + #rep(e_11') = 0.

Since ‘N?_Stk-Int(e_1) = T’ ∈ EQ(Stk-Int), e_11' is not Null, and so e_11' = Push(e_12, i).

Thus ‘e_1 = Pop(Push(e_12, i)) = e_12’ ∈ EQ(Stk-Int).

By replacing e_1 by e_12 in e, we get the required e'.


Case 2 e_1 = Replace(e_11, i1)
Since ‘N?_Stk-Int(e_11) = T’ ∈ EQ(Stk-Int), by the inductive hypothesis, there exists an e_11' such
that ‘e_11 = e_11'’ ∈ EQ(Stk-Int) and #po(e_11') + #rep(e_11') = 0.
Since ‘N?_Stk-Int(e_1) = T’ ∈ EQ(Stk-Int), e_11' is not Null, and so e_11' = Push(e_12, i2).
Thus e_1 = Replace(Push(e_12, i2), i1)
= Push(Pop(Push(e_12, i2)), i1) Axiom 3
= Push(e_12, i1) Axiom 1
So ‘e_1 = Push(e_12, i1)’ ∈ EQ(Stk-Int).

By replacing e_1 in e by Push(e_12, i1), we get the required e'. ∎
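The two cases of the proof of Thm. 4.9 can be sketched as a small normalizer in Python. This is an illustration under stated assumptions: terms are nested tuples, Axiom 1 is read as Pop(Push(s, i)) = s and Axiom 3 as Replace(s, i) = Push(Pop(s), i), as the proof uses them, and a term that applies Pop or Replace to Null is treated as not normal-valued (the sketch returns `None` for it), which is an assumption about Stk-Int rather than something stated in this appendix.

```python
NULL = ("Null",)


def normalize(e):
    """Return a Null/Push-only term equal to e, or None if e is not normal.

    Term shapes (hypothetical encoding): ("Null",), ("Push", s, i),
    ("Pop", s), ("Replace", s, i).
    """
    tag = e[0]
    if tag == "Null":
        return NULL
    if tag not in ("Push", "Pop", "Replace"):
        raise ValueError(f"unknown operation symbol: {tag}")
    s = normalize(e[1])
    if s is None:
        return None
    if tag == "Push":
        return ("Push", s, e[2])
    if s == NULL:
        return None                  # assumed exceptional: Pop/Replace on Null
    if tag == "Pop":
        return s[1]                  # Axiom 1: Pop(Push(s', i)) = s'
    return ("Push", s[1], e[2])      # Axiom 3, then Axiom 1


two = ("Push", ("Push", NULL, 1), 2)
replaced = normalize(("Replace", two, 9))        # Push(Push(Null, 1), 9)
popped = normalize(("Pop", ("Replace", two, 9)))  # Push(Null, 1)
```

The result contains only Null and Push whenever it is not `None`, which is the shape of e' required by the theorem.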


Thm. 4.12 If a specification S is sufficiently complete, then S is behaviorally complete.


Proof If S is inconsistent, then since F(S) = ∅, S is trivially behaviorally complete.

If S is consistent, we show that a sufficiently complete S is also behaviorally complete by
contradiction.

Suppose S is not behaviorally complete, so there exist two reduced algebras A_1
and A_2 in F(S) such that for every D' ∈ Δ, the domains corresponding to D' in A_1 and A_2 are
defined by isomorphically equivalent algebras in F(S'), where S' is a specification of D',
and A_1 is not partially isomorphically embeddable w.r.t. S in A_2. Without any loss of
generality, we can assume that A_1 and A_2 share the same domain corresponding to a
defining type, so for every D' ∈ Δ, Φ_D' is the identity function. Since every constructor is
deterministic, there is a unique one-to-one mapping Φ_D under which A_1 can possibly be
partially isomorphically embeddable in A_2 (see Def. 3.8 of isomorphic embeddability in
Section 3.5). The first two requirements there can be easily satisfied. The third
requirement is complex and is restated below:

For every operation σ ∈ Ω, with interpretations f_1 in A_1 and f_2 in A_2, for every set of
values v_1, ..., v_n such that Φ(v_i) is defined for each 1 ≤ i ≤ n, and
P_σ[x_1/v_1, ..., x_n/v_n]|A_1 = T:

(a) if f_1 signals an exception value ex(v_1', ..., v_m') specified to be optional by S on the
input v_1, ..., v_n, then the associated condition O(x_1, ..., x_n) holds for v_1, ..., v_n, and
f_2(Φ(v_1), ..., Φ(v_n)) either signals ex(Φ(v_1'), ..., Φ(v_m')) or returns Φ(v) for
some v, or
(b) if Φ(v_1'), ..., Φ(v_m') are defined and f_2 signals an exception value
ex(Φ(v_1'), ..., Φ(v_m')) specified to be optional by S on the input Φ(v_1), ..., Φ(v_n),
then the associated condition O(x_1, ..., x_n) holds for Φ(v_1), ..., Φ(v_n), and
f_1(v_1, ..., v_n) either signals ex(v_1', ..., v_m') or returns v'; otherwise,
(c) Φ(f_1(v_1, ..., v_n)) = f_2(Φ(v_1), ..., Φ(v_n)). (*)
For A_1 not to be partially isomorphically embeddable in A_2, at least one of the
above conditions is not satisfied. Suppose condition (a) is not satisfied; then, for the
interpretation f_2 of σ in A_2, we have
f_2(Φ(v_1), ..., Φ(v_n)) ≠ ex(Φ(v_1'), ..., Φ(v_m')),
meaning that A_2 does not satisfy the optional exception condition for σ in S, which
contradicts the assumption that A_2 ∈ F(S). So, condition (a) could not have been
violated. Similarly, it can be shown that condition (b) could not have been violated.
The violation of condition (c) is then the only possibility. In that case, for some
σ ∈ Ω,
(i) exactly one of the two sides of the equation (*) signals an exception,
(ii) different sides signal different exceptions, or
(iii) different sides return different values.
Using the minimality property, we can construct a legal ground term e = σ(e_1, ..., e_n) of type D',
where for each 1 ≤ i ≤ n, e_i is the ground term whose interpretation is v_i in A_1. The
possibilities (i) and (ii) above are ruled out for the following reasons:
For both (i) and (ii), the exception signalled by either side must be different from the
optional exception. Since S is sufficiently complete, either ‘N?_D'(e) = T’ ∈ EQ(S), or
‘N?_D'(e) = F’ ∈ EQ(S). If ‘N?_D'(e) = T’ ∈ EQ(S), then neither e|A_1 nor e|A_2 can be an
exception value, ruling out (i) and (ii). If ‘N?_D'(e) = F’ ∈ EQ(S), then ‘e signals ext’ ∈
EQ(S) for some ext, meaning that
e|A_1 = e|A_2 = ext,
again ruling out (i) and (ii).
The only possibility is (iii). Then e must be of type D' ∈ Δ, as if e is of type D, then
the definition of Φ_D ensures that the equation (*) is satisfied. We have either ‘N?_D'(e) = T’
∈ EQ(S) or neither ‘N?_D'(e) = T’ ∈ EQ(S) nor ‘N?_D'(e) = F’ ∈ EQ(S). If ‘N?_D'(e) = T’ ∈
EQ(S), then there is a ground term e' without any operation symbol of D and auxiliary
functions used in S such that ‘e = e'’ ∈ EQ(S), so e|A_1 = e|A_2 = e'|A_1, ruling out (iii). If
neither ‘N?_D'(e) = T’ ∈ EQ(S) nor ‘N?_D'(e) = F’ ∈ EQ(S), then also there exists a ground
term e' without any operation symbol of D and auxiliary functions used in S such that
‘e = e'’ ∈ EQ(S ∪ { N?_D'(e) = T }), which again rules out (iii) because of reasons
similar to the ones discussed above.

The above thus implies that A_1 is partially isomorphically embeddable in A_2.

Hence the result. ∎


Thm. 4.13 For a consistent and sufficiently complete S, if any two legal ground terms e_1
and e_2 of type D are distinguishable by S, then ‘e_1 ≠ e_2’ ∈ DS(S).


Proof That e_1 and e_2 are distinguishable by S means that for any A ∈ F(S), e_1|A and e_2|A are
distinguishable, i.e.,

(a) e_1|A is an exception value and e_2|A is a normal value,

(b) e_1|A and e_2|A are distinguishable exception values, or

(c) e_1|A and e_2|A are normal values and there exists a term c(x) of type D' ∈ Δ ∪ {D}
with one free variable x of type D such that c[x/v_1]|A is distinguishable from c[x/v_2]|A in
A.
Since S is sufficiently complete, it can be shown that

(i) if a ground term e interprets to an exception value in every algebra A ∈ F(S), then
‘N?_D(e) = F’ ∈ EQ(S), and also

(ii) if e interprets to a normal value in every algebra A ∈ F(S), then ‘N?_D(e) = T’ ∈
EQ(S).

Using the above facts, we prove the theorem by induction on specifications.


Basis Specifications with no defining types
Case 1 Bool
‘T ≠ F’ ∈ DS(Bool). Every ground term of type Bool is equivalent to either T or F, so
the theorem holds.
Case 2 D other than Bool
Subcase 1 S does not specify any operation to signal
All ground terms are observably equivalent, so the theorem holds.
Subcase 2 S specifies operations to signal
Assume e_1 and e_2 are distinguishable by S, so there is one of the above three
possibilities. We show in each case how ‘e_1 ≠ e_2’ can be derived in DS(S).
(a) Since S is sufficiently complete, ‘N?_D(e_1) = F’ ∈ EQ(S) and ‘N?_D(e_2) = T’ ∈
EQ(S), and by the axiom (vii) in Subsection 4.3.3, ‘e_1 ≠ e_2’ ∈ DS(S).
(b) By sufficient completeness of S, using the axiom (vi) in Subsection 4.3.3 and
repeatedly using the argument in Case 2, we get ‘e_1 ≠ e_2’ ∈ DS(S).
(c) By the substitution property of the operations, and the sufficient
completeness of S, we get ‘e_1 ≠ e_2’ ∈ DS(S), by the method of proof by contradiction.


Inductive Step Assume the above statement for the specification S' of a data type D' used
in the specification S of D. To show for S.
Assume e_1 and e_2 are distinguishable by S. For the possibilities (a) and (b), the argument
used in the basis step applies. For the third possibility, in addition to the case considered in
the basis step, we have the case when the interpretations of e_1 and e_2 are distinguishable in
A because of a computation c(x) returning distinguishable results of type D' ∈ Δ. For this
case also, we can prove by contradiction that ‘e_1 ≠ e_2’ ∈ DS(S) as follows:
Assume e_1 = e_2;
then c[x/e_1] = c[x/e_2]. (*)
We have three subcases:
Subcase 1 Both sides of (*) interpret to a normal value in A.
Since S is sufficiently complete, there exist ground terms e_1' and e_2' of type D'
such that e_1', e_2' do not have any occurrence of an operation symbol of D, and
‘c[x/e_1] = e_1'’, ‘c[x/e_2] = e_2'’ ∈ EQ(S), so we have ‘e_1' = e_2'’ ∈ EQ(S). Since e_1', e_2' are
distinguishable by S', by inductive hypothesis, ‘e_1' ≠ e_2'’ ∈ DS(S'), so ‘e_1' ≠ e_2'’ is also in
DS(S). This is a contradiction, as S is consistent. So, ‘e_1 ≠ e_2’ ∈ DS(S).
Subcase 2 One of the two sides of (*) interprets to a normal value.
Without any loss of generality, assume the l.h.s. interprets to a normal value. By
sufficient completeness of S, there is an e_1' such that ‘c[x/e_1] = e_1'’ ∈ EQ(S), and there is an
exception ground term ext such that ‘c[x/e_2] signals ext’ ∈ EQ(S), so again, using the
axioms, we have ‘e_1 ≠ e_2’ ∈ DS(S).

Subcase 3 Both sides of (*) interpret to distinguishable exception values.

Using the sufficient completeness of S, we can show using a similar argument that
‘e_1 ≠ e_2’ ∈ DS(S).

Hence the theorem. ∎


3. Specifications with Exceptional Behavior and Nondeterminism


Thm. 4.14 f and TR(f) are semantically equivalent.


Proof By induction on the structure of f. We only need to show the basis step; the inductive
step is straightforward because the symbols ~, ∨, and ∀ have the same interpretation. So,
we have f as ‘e_1 = e_2’. Consider an extended type algebra A of D in which f and TR(f) can
be interpreted (i.e., A has an interpretation for every nondeterministic operation symbol σ
and the corresponding auxiliary function symbol σ^P such that the interpretation of the
auxiliary function is the relation computed by the interpretation of the nondeterministic
operation symbol).


Case (a) f does not have any occurrence of a nondeterministic operation symbol:
TR(f) = f, so the statement trivially holds.
Case (b) Both e_1 and e_2 have occurrences of nondeterministic symbols:
It is obvious from the description of the procedure TR in Subsection 4.4.1 that the
interpretation of ‘e_1 = e_2’ is equivalent to the interpretation of TR(f).

Case (c) Exactly one of e_1 and e_2 has occurrences of nondeterministic symbols: Again from
the description of TR in Subsection 4.4.1, the interpretation of ‘e_1 = e_2’ is equivalent to the
interpretation of TR(f). ∎


Appendix IV - Specifications of Data Types used in Chapter 5 


In this appendix, we give specifications of the data types Null,
Struct [n_1: D_1, ..., n_k: D_k], Oneof [n_1: D_1, ..., n_k: D_k], and Sequence-Int used in Chapter 5.
Struct and Oneof are type schema. Below, we specify an instance of these schema
assuming fixed but unspecified parameters, i.e., k as well as D_1, ..., D_k are fixed. Since the
specification is given for an arbitrary k, we have used the ‘...’ notation. The specification of
any particular instance, such as Oneof [empty: Null, pair: Pair] or
Struct [car: Int, cdr: List-Int] used in Chapter 5, can be given without using the ‘...’
notation.


Figure A4.1. Specification of Null

Operations

Nil : → Null
Equal : Null × Null → Bool as x1 = x2

Axioms

Nil = Nil = T




Figure A4.2. Specification of Struct [n_1: D_1, ..., n_k: D_k]

Struct [n_1: D_1, ..., n_k: D_k] as D

Operations

Create : D_1 × ... × D_k → D
Fetch_n_1 : D → D_1
...
Fetch_n_k : D → D_k
Replace_n_1 : D × D_1 → D
...
Replace_n_k : D × D_k → D
Equal : D × D → Bool as x1 = x2

Axioms

Fetch_n_1(Create(x1, ..., xk)) = x1
...
Fetch_n_k(Create(x1, ..., xk)) = xk
Replace_n_1(Create(x1, ..., xk), y1) = Create(y1, ..., xk)
...
Replace_n_k(Create(x1, ..., xk), yk) = Create(x1, ..., yk)
Create(x1, ..., xk) = Create(y1, ..., yk) = (x1 = y1) ∧ ... ∧ (xk = yk)
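As an informal illustration of the Struct axioms, the following Python sketch models one instance with immutable tuples. The helper `make_struct` and the way selector names are passed as strings are assumptions of the sketch; the specification itself has one Fetch and one Replace operation per selector.

```python
def make_struct(*names):
    """Build Create/Fetch/Replace for a Struct instance with the given selectors."""

    def create(*values):
        if len(values) != len(names):
            raise TypeError("Create expects one value per selector")
        return tuple(values)

    def fetch(name, d):
        # Fetch_n(Create(x1, ..., xk)) = the component named n
        return d[names.index(name)]

    def replace(name, d, y):
        # Replace_n returns a new value; d itself is never modified
        i = names.index(name)
        return d[:i] + (y,) + d[i + 1:]

    return create, fetch, replace


# Instance modelled on Struct [car: Int, cdr: ...], with plain integers for brevity.
create, fetch, replace = make_struct("car", "cdr")
p = create(1, 2)
q = replace("cdr", p, 5)
```

Tuple equality is componentwise, which matches the last axiom of the figure: two Create terms are equal exactly when all corresponding components are equal.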


Figure A4.3. Specification of Oneof [n_1: D_1, ..., n_k: D_k]

Oneof [n_1: D_1, ..., n_k: D_k] as D

Operations

Make_n_1 : D_1 → D
...
Make_n_k : D_k → D
Value_n_1 : D → D_1
 → wrong-tag
...
Value_n_k : D → D_k
 → wrong-tag
Is_n_1 : D → Bool
...
Is_n_k : D → Bool
Equal : D × D → Bool


Restrictions

~ Is_n_1(x) => Value_n_1(x) signals wrong-tag
...
~ Is_n_k(x) => Value_n_k(x) signals wrong-tag


Axioms

Value_n_1(Make_n_1(x1)) = x1
...
Value_n_k(Make_n_k(xk)) = xk
Is_n_1(Make_n_1(x1)) = T
...
Is_n_1(Make_n_k(xk)) = F
...
Is_n_k(Make_n_1(x1)) = F
...
Is_n_k(Make_n_k(xk)) = T
Make_n_1(x1) = Make_n_1(y1) = x1 = y1
...
Make_n_1(x1) = Make_n_k(yk) = F
...
Make_n_k(xk) = Make_n_1(y1) = F
...
Make_n_k(xk) = Make_n_k(yk) = xk = yk
x = y = y = x
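The Oneof specification describes a tagged union whose Value operations signal wrong-tag when the tag does not match. A minimal Python sketch, assuming a (tag, value) pair encoding and an exception class `WrongTag` to stand in for the signalled exception (both are choices of the sketch, not of the specification):

```python
class WrongTag(Exception):
    """Stands in for the wrong-tag exception of the specification."""


def make(tag, x):
    # Make_n(x): inject x into the union under tag n
    return (tag, x)


def is_(tag, d):
    # Is_n(Make_m(x)) = T exactly when n and m are the same tag
    return d[0] == tag


def value(tag, d):
    # Restriction: ~Is_n(x) => Value_n(x) signals wrong-tag
    if not is_(tag, d):
        raise WrongTag(tag)
    return d[1]


# Instance modelled on Oneof [empty: Null, pair: Pair]; payloads are placeholders.
e = make("empty", None)
p = make("pair", (1, 2))
```

Pair equality in Python compares the tag first and then the payload, which agrees with the equality axioms: values made with different tags are never equal.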




Figure A4.4. Specification of Sequence-Int

Sequence-Int as SI

Operations
New : → SI
Addl : SI × Int → SI
Addh : SI × Int → SI
Concat : SI × SI → SI as x1 · x2
Subseq : SI × Int × Int → SI
 → bounds
 → negative-size
Fill : Int × Int → SI
 → negative-size
Fetch : SI × Int → Int as x[i]
 → bounds
Bottom : SI → Int
 → bounds
Top : SI → Int
 → bounds
Reml : SI → SI
 → bounds
Remh : SI → SI
 → bounds
Size : SI → Int
Empty : SI → Bool
Replace : SI × Int × Int → SI
 → bounds
Index : SI × Int → Int
 → element-not-in
Member : SI × Int → Bool
Equal : SI × SI → Bool as x1 = x2
Restrictions
Restrictions 


(i1 < 1 ∨ i1 > (Size(s) + 1)) => Subseq(s, i1, i2) signals bounds
(~(i1 < 1 ∨ i1 > (Size(s) + 1)) ∧ (i2 < 0)) => Subseq(s, i1, i2) signals negative-size
i < 0 => Fill(i, j) signals negative-size
(i < 1 ∨ i > Size(s)) => Fetch(s, i) signals bounds
Size(s) = 0 => Bottom(s) signals bounds
Size(s) = 0 => Top(s) signals bounds
Size(s) = 0 => Reml(s) signals bounds
Size(s) = 0 => Remh(s) signals bounds
(i < 1 ∨ i > Size(s)) => Replace(s, i, j) signals bounds
~ Member(s, j) => Index(s, j) signals element-not-in


Axioms 


Addl(New, j) = Addh(New, j)
Addl(Addh(s, j1), j2) = Addh(Addl(s, j2), j1)
s · New = s
s1 · Addh(s2, j) = Addh(s1 · s2, j)
Subseq(s, i1, 0) = New
Subseq(Addh(s, j), i1, i2 + 1) = if (i1 + i2) < (Size(s) + 1) then Subseq(s, i1, i2 + 1)
 else if (i1 + i2) = (Size(s) + 1) then Addh(Subseq(s, i1, i2), j)
 else Subseq(Addh(s, j), i1, Size(s) - i1 + 2)
Fill(0, j) = New
Fill(i + 1, j) = Addh(Fill(i, j), j)
Fetch(Addh(s, j), i) = if i = Size(s) + 1 then j else Fetch(s, i)
Bottom(s) = Fetch(s, 1)
Top(s) = Fetch(s, Size(s))
Reml(s) = Subseq(s, 2, Size(s) - 1)
Remh(s) = Subseq(s, 1, Size(s) - 1)
Size(New) = 0
Size(Addh(s, j)) = Size(s) + 1
Empty(New) = T
Empty(Addh(s, j)) = F
Member(New, j) = F
Member(Addh(s, j1), j2) = if j1 = j2 then T else Member(s, j2)
Replace(Addh(s, j1), i, j2) = if i = Size(s) + 1 then Addh(s, j2) else Addh(Replace(s, i, j2), j1)
Fetch(s, Index(s, j)) = j
x = x = T
x = y = y = x
New = Addh(s, j) = F
Addh(s1, j1) = Addh(s2, j2) = (j1 = j2) ∧ (s1 = s2)
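To see how the Subseq axioms behave, the following Python sketch transcribes a few of the Sequence-Int axioms directly, modelling a sequence as a tuple built by Addh. It is an illustration only: the exception classes and function names are choices of the sketch, and the case Subseq(New, 1, n) with n > 0, for which the figure gives no axiom, is assumed here to yield New.

```python
class Bounds(Exception):
    """Stands in for the bounds exception."""


class NegativeSize(Exception):
    """Stands in for the negative-size exception."""


NEW = ()


def addh(s, j):
    return s + (j,)


def size(s):
    # Size(New) = 0; Size(Addh(s, j)) = Size(s) + 1
    return 0 if s == NEW else size(s[:-1]) + 1


def fetch(s, i):
    if i < 1 or i > size(s):
        raise Bounds("Fetch")
    init, j = s[:-1], s[-1]          # s = Addh(init, j)
    return j if i == size(init) + 1 else fetch(init, i)


def subseq(s, i1, i2):
    if i1 < 1 or i1 > size(s) + 1:
        raise Bounds("Subseq")
    if i2 < 0:
        raise NegativeSize("Subseq")
    if i2 == 0:
        return NEW                   # Subseq(s, i1, 0) = New
    if s == NEW:
        return NEW                   # assumption: not covered by the axioms
    init, j = s[:-1], s[-1]          # s = Addh(init, j), requested length n + 1
    n = i2 - 1
    if i1 + n < size(init) + 1:
        return subseq(init, i1, n + 1)
    if i1 + n == size(init) + 1:
        return addh(subseq(init, i1, n), j)
    return subseq(s, i1, size(init) - i1 + 2)   # clamp the length to the end of s


def reml(s):
    if size(s) == 0:
        raise Bounds("Reml")
    return subseq(s, 2, size(s) - 1)


def remh(s):
    if size(s) == 0:
        raise Bounds("Remh")
    return subseq(s, 1, size(s) - 1)


s = addh(addh(addh(addh(NEW, 10), 20), 30), 40)
```

Under this reading, Subseq(s, i1, i2) is the subsequence of length i2 starting at index i1, truncated at the high end of s, so on tuples it agrees with the slice `s[i1 - 1 : i1 - 1 + i2]`.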


